MIDTERM-ADVANCED DBM C3 L2
97 questions • 1 year ago
  • Jamaica Rose Gilo

    Question list

  • 1

    fundamental technique in data mining used to classify data into different categories based on a set of predefined rules

    Rule-Based Classification

  • 2

    Rules used in rule-based classification are often derived from what, making the process both intuitive and interpretable?

    The data itself

  • 3

    supervised learning task where a model learns to map input data to a specific category or label

    Classification

  • 4

    The goal is to predict the class of unseen instances based on what was learned from the training data.

    Classification

  • 5

    takes the form: IF condition THEN class

    Rules

  • 6

    can be applied sequentially to classify data

    Rule Set

  • 7

    Rules are often ordered by a specific priority or confidence level.

    Rule Set
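
    Note: a minimal sketch of an ordered rule set, assuming hypothetical weather attributes (outlook, humidity, windy) and a default class that are not part of the original cards. Rules are tried top to bottom and the first match decides the class.

```python
# Hypothetical ordered rule set: each entry is (IF-condition, THEN-class).
# Earlier rules have higher priority.
RULES = [
    (lambda x: x["outlook"] == "sunny" and x["humidity"] > 75, "dont_play"),
    (lambda x: x["outlook"] == "rainy" and x["windy"], "dont_play"),
    (lambda x: x["outlook"] == "overcast", "play"),
]
DEFAULT_CLASS = "play"  # assumption: fallback when no rule fires


def classify(instance):
    """Apply the rules sequentially and return the class of the first match."""
    for condition, label in RULES:
        if condition(instance):
            return label
    return DEFAULT_CLASS


print(classify({"outlook": "sunny", "humidity": 80, "windy": False}))  # dont_play
```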

  • 8

    How does rule-based classification work? (four steps)

    Rule Generation, Rule Evaluation, Rule Matching, Conflict Resolution

  • 9

    Rules are generated from the training data based on the relationships between different features and the target class.

    Rule Generation

  • 10

    Techniques such as decision trees (e.g., ID3, C4.5), association rule mining (e.g., Apriori), or direct heuristics (e.g., covering algorithms) can be used to extract rules

    Rule Generation

  • 11

    Rules that correctly classify more instances with high accuracy are generally preferred.

    Rule Evaluation

  • 12

    Each rule is evaluated based on its accuracy or coverage.

    Rule Evaluation
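
    Note: a small sketch of rule evaluation. It assumes the common textbook definitions (coverage is the fraction of all instances that satisfy the rule's condition; accuracy is the fraction of covered instances that have the rule's class), and the loan data is made up for illustration.

```python
def evaluate_rule(condition, label, data):
    """Return (coverage, accuracy) of the rule IF condition THEN label."""
    covered = [row for row in data if condition(row)]
    coverage = len(covered) / len(data)
    correct = sum(1 for row in covered if row["class"] == label)
    # A rule that covers nothing has no meaningful accuracy; report 0.0.
    accuracy = correct / len(covered) if covered else 0.0
    return coverage, accuracy


# Hypothetical training data.
DATA = [
    {"income": "high", "class": "approve"},
    {"income": "high", "class": "approve"},
    {"income": "high", "class": "reject"},
    {"income": "low", "class": "reject"},
    {"income": "low", "class": "reject"},
]

# Rule: IF income == high THEN approve
coverage, accuracy = evaluate_rule(lambda r: r["income"] == "high", "approve", DATA)
print(coverage, accuracy)
```

    Here the rule covers 3 of 5 instances and is right on 2 of those 3.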

  • 13

    When classifying a new instance, the rule-based system evaluates which rules match the instance’s attributes.

    Rule Matching

  • 14

    In some cases, multiple rules might apply to a single instance, leading to conflicts.

    Conflict Resolution

  • 15

    Conflict Resolution Strategies

    Rule Priority, Voting

  • 16

    Several rules contribute votes to different classes, and the majority vote determines the class.

    Voting

  • 17

    More specific rules or rules with higher confidence may take precedence

    Rule Priority
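
    Note: a sketch contrasting the two strategies, assuming each matching rule is represented as a hypothetical (class, confidence) pair. The values are invented so that the two strategies disagree.

```python
from collections import Counter

# Hypothetical rules that all matched the same instance: (class, confidence).
matched = [("spam", 0.70), ("ham", 0.95), ("spam", 0.60)]


def resolve_by_priority(matched_rules):
    """Rule priority: the matching rule with the highest confidence wins."""
    return max(matched_rules, key=lambda rule: rule[1])[0]


def resolve_by_voting(matched_rules):
    """Voting: each matching rule casts one vote; the majority class wins."""
    votes = Counter(label for label, _ in matched_rules)
    return votes.most_common(1)[0][0]


print(resolve_by_priority(matched))  # ham
print(resolve_by_voting(matched))    # spam
```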

  • 18

    These methods generate classification rules directly from the data without creating an intermediate model.

    Direct Methods

  • 19

    an efficient algorithm for generating classification rules

    RIPPER

  • 20

    RIPPER stands for?

    Repeated Incremental Pruning to Produce Error Reduction

  • 21

    This is a simple algorithm that generates rules based on a single attribute at a time.

    OneR

  • 22

    OneR stands for?

    One Rule
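
    Note: a compact sketch of the usual OneR procedure, assuming categorical attributes and a made-up weather table. For each attribute it maps every value to its most frequent class, counts the resulting errors, and keeps the attribute with the fewest errors.

```python
from collections import Counter, defaultdict


def one_r(rows, attributes, target):
    """Return (best_attribute, value_to_class_rule, training_errors)."""
    best = None
    for attr in attributes:
        # Class counts for every value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # One rule per value: predict its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the instances outside the majority class of each value.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical training data.
ROWS = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

attribute, rule, errors = one_r(ROWS, ["outlook", "windy"], "play")
print(attribute, errors)  # outlook 1
```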

  • 23

    These methods derive rules from another classification model.

    Indirect Methods

  • 24

    It’s possible, though more complex, to extract rules from this by analyzing the learned weights.

    Neural Networks

  • 25

    A model that can be converted into a set of classification rules

    Decision Trees

  • 26

    This makes rule-based classifiers highly interpretable compared to other methods like neural networks or support vector machines.

    Interpretability

  • 27

    The rules generated by this method are easy to understand and explain.

    Interpretability

  • 28

    Since rules are explicit, users can easily track how decisions are made, which is crucial for applications in sensitive fields like medical diagnosis or finance.

    Transparency

  • 29

    The system can be extended by adding more rules or adjusting the existing rules, allowing for greater adaptability to changes in the environment or new data.

    Flexibility

  • 30

    Rule-based systems often evaluate only a small subset of rules for each instance, reducing overall complexity.

    Efficiency

  • 31

    This happens when many specific rules are generated that capture noise rather than general patterns.

    Overfitting

  • 32

    Some rules may overlap and can complicate the classification process.

    Rule Redundancy

  • 33

    Rule-based classifiers can struggle with data that contains intricate patterns, which may require more sophisticated models like ensemble methods or deep learning.

    Limited Performance on Complex Data

  • 34

    Applications of Rule-Based Classifiers

    Medical Diagnosis, Fraud Detection, Customer Segmentation

  • 35

    Businesses can classify customers based on behavioral data to tailor marketing strategies.

    Customer Segmentation

  • 36

    In banking and finance, rules can be used to detect fraudulent activities based on suspicious patterns.

    Fraud Detection

  • 37

    Rule-based classifiers can generate transparent decision-making models to assist healthcare professionals in diagnosing diseases.

    Medical Diagnosis

  • 38

    This offers an interpretable and intuitive approach to classifying data by deriving simple IF-THEN rules from training data.

    Rule-Based Classification

  • 39

    While this method excels in domains that require transparency and ease of understanding, it may struggle with complex or noisy datasets.

    Rule-Based Classification

  • 40

    With complex data, it is often used in conjunction with other classification techniques for enhanced performance in real-world applications.

    Rule-Based Classification

  • 41

    A process used by companies to turn raw data into useful information.

    Data Mining

  • 42

    an essential tool that allows us to turn raw data into actionable insights

    Data Mining

  • 43

    6 Data Mining Tasks

    Classification, Clustering, Regression, Association Rule Learning, Anomaly Detection, Summarization

  • 44

    It's about predicting the category or class of a data point based on past data

    Classification

  • 45

    An example of this is: Predicting whether an email is spam or not spam (spam filtering).

    Classification

  • 46

    The algorithm is trained on labeled data (data with known categories) and learns to assign new data points to one of these categories.

    Classification

  • 47

    Applications of classification

    Email Filtering, Medical Diagnosis, Sentiment Analysis, Image Recognition

  • 48

    Spam detection

    Email Filtering

  • 49

    Classifying diseases based on symptoms and test results.

    Medical Diagnosis

  • 50

    Determining whether a piece of text expresses positive, negative, or neutral sentiment.

    Sentiment Analysis

  • 51

    Classifying images into categories (e.g., identifying objects in photos).

    Image Recognition

  • 52

    It involves grouping similar data points together based on their characteristics without predefined labels.

    Clustering

  • 53

    An example of this is: Grouping customers based on their shopping habits.

    Clustering

  • 54

    Data points that are similar to each other are clustered into groups, helping to discover natural structures within the data.

    Clustering

  • 55

    Clustering Algorithms

    K-means, Hierarchical Clustering, DBSCAN, Gaussian Mixture Models

  • 56

    Partitions data into a fixed number of clusters based on distance to centroids.

    K-Means
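
    Note: a bare-bones sketch of the K-means loop on one-dimensional data. The naive initialization (first k points as centroids) and the sample points are assumptions made to keep the example short and deterministic; real implementations usually choose starting centroids more carefully.

```python
def k_means_1d(points, k, iterations=100):
    """Cluster 1-D points into k groups; returns (centroids, clusters)."""
    centroids = list(points[:k])  # assumption: naive initialization
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters


centroids, clusters = k_means_1d([1, 2, 3, 10, 11, 12], k=2)
print(centroids)  # [2.0, 11.0]
```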

  • 57

    Builds a tree of clusters based on similarity, allowing for a hierarchy of clusters.

    Hierarchical Clustering

  • 58

    Groups together points that are closely packed, marking points in low-density regions as outliers.

    DBSCAN

  • 59

    DBSCAN stands for

    Density-Based Spatial Clustering of Applications with Noise

  • 60

    Assumes data points are generated from a mixture of several Gaussian distributions.

    Gaussian Mixture Models

  • 61

    It is used to predict a numerical value based on the relationship between variables.

    Regression

  • 62

    An example of this is: Predicting house prices based on factors like location, size, and age.

    Regression

  • 63

    The algorithm tries to fit a line (or curve) that best describes the relationship between variables to predict continuous values.

    Regression

  • 64

    Common Regression Algorithms

    Linear Regression, Polynomial Regression, Ridge and Lasso Regression, Decision Trees and Random Forests, Support Vector Regression

  • 65

    Models the relationship between dependent and independent variables as a straight line.

    Linear Regression
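
    Note: a sketch of simple linear regression with one independent variable, fitted by ordinary least squares. The house sizes and prices are invented, and chosen so that price is exactly 3 times size.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: size in square metres, price in thousands.
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90 + intercept  # predicted price for a 90 m^2 house
print(slope, intercept, predicted)
```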

  • 66

    Extends linear regression by fitting a polynomial equation to the data.

    Polynomial Regression

  • 67

    Regularization techniques that prevent overfitting by adding a penalty for large coefficients.

    Ridge and Lasso Regression

  • 68

    Can also be used for regression tasks by predicting values based on tree structures.

    Decision Trees and Random Forests

  • 69

    Uses support vector machines for regression tasks.

    Support Vector Regression

  • 70

    Applications of Regression

    Financial Forecasting, Sales Prediction, Risk Assessment, Marketing Response Modeling

  • 71

    It is about finding relationships between variables in a large dataset.

    Association Rule Learning

  • 72

    An example of this is: Market basket analysis, where you discover that customers who buy bread are likely to also buy butter

    Association Rule Learning

  • 73

    It identifies sets of items that frequently appear together in the data (like discovering frequent itemsets).

    Association Rule Learning
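
    Note: a sketch of how the bread and butter relationship could be quantified. Support and confidence are the standard measures for association rules but are not named in the cards above, and the transactions are made up.

```python
# Hypothetical shopping baskets.
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated probability of consequent given antecedent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


s = support({"bread", "butter"}, TRANSACTIONS)
c = confidence({"bread"}, {"butter"}, TRANSACTIONS)
print(round(s, 2), round(c, 2))  # 0.6 0.75
```

    Bread and butter appear together in 3 of 5 baskets, and 3 of the 4 baskets with bread also contain butter.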

  • 74

    Applications of Association Rule Learning

    Market basket analysis to improve product placement and cross-selling strategies, Recommender systems in e-commerce, Customer segmentation and behavior analysis, Web usage mining to understand navigation patterns.

  • 75

    It identifies data points that deviate significantly from the normal pattern.

    Anomaly Detection

  • 76

    An example of this is: Detecting fraudulent credit card transactions.

    Anomaly Detection

  • 77

    In anomaly detection, the algorithm learns what normal data looks like, and anything that falls far outside this pattern is flagged as what?

    Anomaly

  • 78

    3 types of Anomalies

    Point Anomalies, Contextual Anomalies, Collective Anomalies

  • 79

    A single data point that deviates significantly from the rest of the dataset. For example, a sudden spike in credit card transactions could indicate fraud.

    Point Anomalies
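
    Note: a sketch of one simple way to flag point anomalies, using z-scores. The method, the threshold of 2 standard deviations, and the transaction amounts are all assumptions for illustration; the cards do not prescribe a specific technique.

```python
from statistics import mean, pstdev


def point_anomalies(values, threshold=2.0):
    """Return the values lying more than threshold standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:  # all values identical, nothing stands out
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical credit card transaction amounts.
amounts = [20, 25, 22, 19, 24, 21, 23, 500]
print(point_anomalies(amounts))  # [500]
```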

  • 80

    Anomalies that are only considered unusual in a specific context. For example, a high temperature reading is normal in summer but anomalous in winter.

    Contextual Anomalies

  • 81

    A group of data points that collectively deviate from the expected pattern, even if individual points are not anomalies. For instance, a sudden increase in network traffic over a short period may indicate a DDoS attack.

    Collective Anomalies

  • 82

    Applications of Anomaly Detection

    Fraud detection in banking and e-commerce, Network intrusion detection to identify malicious activity, Fault detection in manufacturing processes or machinery, Health monitoring to detect unusual patterns in patient data, Quality control in production processes.

  • 83

    This creates a compact representation of the data, providing a summary of key information

    Summarization

  • 84

    An example of this is: Summarizing a dataset by showing averages, counts, and other statistics.

    Summarization
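
    Note: a sketch of this kind of statistical summary using Python's standard library. The choice of statistics and the sample values are assumptions.

```python
from statistics import mean, median


def summarize(values):
    """Condense a list of numbers into a few descriptive statistics."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


summary = summarize([12, 15, 11, 20, 17])
print(summary)
```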

  • 85

    This task reduces the complexity of the data by generating overviews and simplified reports.

    Summarization

  • 86

    2 types of summarization

    Extractive Summarization, Abstractive Summarization

  • 87

    Involves selecting key sentences or phrases directly from the original text to create a summary.

    Extractive Summarization

  • 88

    Techniques may include ranking sentences based on their importance using algorithms like TextRank or TF-IDF.

    Extractive Summarization

  • 89

    Involves generating new sentences that convey the main ideas of the text, potentially using different wording than the original.

    Abstractive Summarization

  • 90

    Often utilizes advanced machine learning models, such as transformer-based architectures (e.g., BERT, GPT).

    Abstractive Summarization

  • 91

    Applications of Summarization

    News aggregation services that provide concise articles, Summarizing research papers or reports for quick understanding, Document summarization in legal and business contexts, Enhancing user experience in chatbots by providing brief responses

  • 92

    This helps in decision-making (e.g., identifying risks).

    Classification

  • 93

    It helps in segmenting customers or products for targeted marketing.

    Clustering

  • 94

    It helps in forecasting trends and making predictions.

    Regression

  • 95

    It helps businesses understand customer behavior.

    Association Rule Learning

  • 96

    This improves security by identifying irregularities.

    Anomaly Detection

  • 97

    It simplifies data interpretation by providing high-level insights.

    Summarization
