Supervised learning
email --> is_spam (spam filtering)
audio --> text transcription (speech recognition)
English --> French (machine translation)
visual inspection --> defect detection (quality control)
sequence of words --> next word prediction (chatbot)
LLMs
An LLM repeatedly predicts the next word.
input --> output
my favorite drink --> is
my favorite drink is coffee.
my favorite drink is coffee, and I like it because it is.
Output is sanitized so that it is not offensive or harmful.
Neural networks have improved the quality of LLMs; as you train them on more data, they get better.
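The "repeatedly predict the next word" loop can be sketched with a toy model. This is only an illustration under strong assumptions: the corpus is made up, and a simple word-pair (bigram) count stands in for the neural network a real LLM would use. The names `predict_next` and `generate` are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real LLM trains on vastly more text.
corpus = (
    "my favorite drink is coffee . "
    "my favorite drink is coffee . "
    "my favorite food is pizza ."
).split()

# Count which word follows each word (a bigram table).
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = next_word_counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

def generate(prompt, max_new_words=3):
    """Repeatedly append the predicted next word to the prompt."""
    words = prompt.split()
    for _ in range(max_new_words):
        nxt = predict_next(words[-1])
        if nxt is None or nxt == ".":
            break
        words.append(nxt)
    return " ".join(words)

print(generate("my favorite drink"))  # my favorite drink is coffee
```

Each pass through the loop treats everything generated so far as the new input, which is the same input --> output pattern shown above.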
What is data? (dataset)
Data is unique to your business.
What is input and what is output?
size_of_house + no_of_bedrooms + location --> price_of_house
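As a rough sketch of learning such an input --> output mapping, the snippet below fits a straight line from size_of_house to price_of_house. The data points are invented, only one input is used for brevity (no_of_bedrooms and location are left out), and `fit_line` and `predict_price` are hypothetical names.

```python
# Hypothetical training examples: size_of_house (square feet) and price_of_house.
sizes = [500, 1000, 1500, 2000]
prices = [100_000, 200_000, 300_000, 400_000]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(sizes, prices)

def predict_price(size_of_house):
    """Apply the learned mapping to a new input."""
    return slope * size_of_house + intercept

print(round(predict_price(1200)))  # 240000
```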
How do you acquire data?
Manual labeling
Observing user behaviors (user_id, time, price, has_purchased?)
Observing machine behaviors (machine_id, time, temperature, pressure, has_failed?). This data can be used to predict whether a machine is going to fail.
Downloading it from websites or getting it through partnerships (Kaggle, UCI Machine Learning Repository, etc.). Keep licensing and copyright in mind.
Not all data is created equal; some data is more valuable than others.
More data doesn't always mean better results; data quality often matters more than quantity.
Data can have missing values, outliers, and noise. (We need to clean up data before processing it.)
Some types of data, like images, audio, and text, are unstructured, which means they don't have a predefined format.
Techniques for dealing with unstructured data are different from those for structured data.
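A minimal sketch of cleaning structured data, assuming invented rows and a deliberately crude outlier rule (drop anything above five times the median price). Real projects would choose rules that fit their own data.

```python
from statistics import median

# Hypothetical rows: (size_of_house, price_of_house); None marks a missing value.
rows = [
    (500, 100_000),
    (1000, None),        # missing price
    (1500, 300_000),
    (2000, 400_000),
    (1200, 9_000_000),   # likely a data-entry error (outlier)
]

# 1. Drop rows with missing values.
complete = [row for row in rows if None not in row]

# 2. Drop rows whose price is far above the median (a crude outlier rule).
typical_price = median(price for _, price in complete)
cleaned = [row for row in complete if row[1] <= 5 * typical_price]

print(cleaned)  # [(500, 100000), (1500, 300000), (2000, 400000)]
```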
The terminology of AI
ML vs data science
Machine learning: algorithms that use datasets to learn how to get an output from an input, e.g. size_of_house + no_of_bedrooms + location --> price_of_house. It is the field of study that gives computers the ability to learn without being explicitly programmed. The result is a software model that can be run to generate outputs for a given input.
Data science: a team analyzes datasets to find insights, e.g. "Did you know newly renovated houses sell for 20% more than non-renovated houses?" It is the science of extracting knowledge and insights from data. The result is typically a slide deck or pitch deck that presents the insights, for example to investors.
Deep learning
Artificial neural networks.
Nodes and connections, loosely inspired by the human brain.
The details of how they work are almost completely unrelated to how the actual human brain works.
Takes input and outputs a prediction.
Nesting: AI contains machine learning plus other tools (knowledge graphs, rule-based systems, etc.); machine learning contains deep learning plus other tools.
What makes an AI company?
What makes a good internet company?
Shopping mall + Website != Internet company
A/B testing
Short iteration cycles
Data-driven decision making (engineers and product managers make decisions based on data, not executive opinions)
What makes a good AI company?
Strategic data acquisition
Unified data platform
Using AI to automate repetitive tasks
New roles and responsibilities (ML engineers, data scientists, etc.)
What ML can and cannot do?
email --> spam? (0/1) (spam filtering)
audio --> text transcripts (speech recognition)
English --> Chinese (machine translation)
ad, user info --> click? (0/1) (online advertising)
image, radar info --> position of other cars (self-driving car)
image of phone --> defect? (0/1) (visual inspection)
sequence of words --> the next word (chatbot)
Imperfect rule of thumb: if a human can do a task with less than one second of thought, ML can probably do it too.
Feasible
Learning simple concepts
Having lots of data
e.g.:
Self-driving cars can estimate where the other cars they have seen are, based on their previous positions and what is in front of them.
Not feasible
Complex concepts
With little sample data
e.g.:
If an AI is trained on one type of data, it often performs poorly on other types. (E.g., an AI trained to detect pneumonia on lateral chest X-rays may fail on frontal chest X-rays or on X-rays that are not aligned properly.)
Interpreting human gestures, like a road worker signaling a car to stop, is not feasible for AI to learn reliably: it requires a complex understanding of human behavior and context, and a very large amount of data. (Even we struggle to understand them sometimes.)
Non technical explanation of deep learning
flowchart LR
price[Price] --> n1((N1))
shipping_cost[Shipping Cost] --> n1
n1 --> |affordability| d1((Demand))
marketing[Marketing] --> n2((N2))
material[Material] --> n3((N3))
n2 --> | awareness | d1
n3 --> | perceived quality | d1
marketing --> n3
price --> n3
The above neural network is a simple example of how deep learning works. It takes multiple inputs (price, shipping cost, marketing, material) and processes them through nodes (N1, N2, N3) to produce an output (demand).
Feed it lots of examples of (price, shipping cost, marketing, material, demand) and it will figure out the relationships on its own.
For example, a face detection model will figure out the relationships between pixels, edges, and shapes to identify parts of the face (eyes, nose, mouth, etc.), and then combine them to identify the face as a whole.
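The demand network in the diagram can be written as a small forward pass. This is a sketch only: the weights below are invented for illustration, whereas a trained network would learn them from (price, shipping cost, marketing, material, demand) examples. The function name `predict_demand` and the choice of a ReLU activation are assumptions.

```python
def relu(x):
    """Pass positive values through; clamp negatives to zero."""
    return max(0.0, x)

def predict_demand(price, shipping_cost, marketing, material):
    # Hidden nodes. These weights are made up for illustration;
    # training would learn them from (inputs, demand) examples.
    affordability = relu(1.0 - 0.5 * price - 0.5 * shipping_cost)               # N1
    awareness = relu(0.8 * marketing)                                           # N2
    perceived_quality = relu(0.3 * price + 0.2 * marketing + 0.6 * material)    # N3
    # The output node combines the three hidden values into a demand score.
    return 0.5 * affordability + 0.3 * awareness + 0.2 * perceived_quality

cheap = predict_demand(price=0.2, shipping_cost=0.1, marketing=0.5, material=0.5)
pricey = predict_demand(price=0.9, shipping_cost=0.9, marketing=0.5, material=0.5)
print(cheap > pricey)  # True: with these weights, lower price and shipping raise demand
```

Note that price feeds both N1 and N3, matching the two arrows leaving the price box in the diagram.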
Starting point of an AI project
What is the workflow of a machine learning project?
Key steps of a machine learning project
Alexa
Collect data of people saying "Alexa" (or other trigger words)
Train the model to recognize the trigger words (iterate many times until it's good enough)
Deploy the model, then get data back to maintain and update it (this may or may not be possible, depending on the company's privacy and security policies)
Self-driving car
Collect images labeled with car positions (red squares drawn on the pictures)
Train the model to predict the position of other cars
Deploy model
What is the workflow of a data science project?
Optimizing a sales funnel
Collect data of user behavior on the website (visits, clicks, time spent, etc.)
Analyze the data to find insights: "Overseas users leave when they find high shipping costs", "Spend fewer marketing dollars on overseas users".
Suggest hypotheses to improve the sales funnel: "Reduce shipping costs for overseas users". Re-analyze the data to see if the changes had an impact.
Optimizing the manufacturing line
Mix clay, shape mug, add glaze, fire kiln, final inspection
Collect data of the manufacturing process (temperature, pressure, time, etc.)
Analyze the data to find insights: "High temperature leads to more defects", "Reduce temperature to reduce defects", "Because ambient temperature is warmer in the afternoon, we need to reduce the temperature in the afternoon".
Suggest hypotheses to improve the manufacturing process or yield: "Reduce temperature to reduce defects". Re-analyze the data to see if the changes had an impact.
Every job function needs to learn how to use data
Data science can help optimize the sales funnel; machine learning can automate lead sorting.
Data science can help optimize the manufacturing process; machine learning can automate quality control.
Data science can help optimize the recruiting funnel; machine learning can automate resume screening. (Your system should be ethical and not biased.)
Data science can help optimize the user experience of a website; machine learning can automate content recommendations, suggest push notifications, etc.
Data science can help suggest what to plant, when, and where; machine learning can detect where weeds are and help automate weeding.
How to choose an AI project?
What AI can do (AI experts)
Valuable for your business (domain experts)
Select a project that sits in the overlap of the two.
Framework for brainstorming AI projects
Think about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (difficult for new players to come in)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do? What is valuable for your business?
The project should sit in the overlap between the two.
Technical diligence
Can an AI system meet the desired performance? (e.g., 95% accuracy)
How much data is needed to achieve that performance?
Engineering timeline
Business diligence
Lower costs
Increase revenue
Launch a new product or service
Ethical diligence
Does it make society better?
Build vs Buy?
ML projects can be in-house or outsourced.
DS projects are more commonly in-house. (They are so closely tied to your business that it makes sense to keep them in-house.)
Some things will be industry standard; buy those instead of reinventing the wheel. (Don't try to outrun a train.)
Working with an AI team
Specify acceptance criteria
Goal: detect defects in coffee mugs in 95% of cases. (This is a statistical target, measured on average over many mugs.)
Provide the AI team with a dataset to measure performance. (This test set doesn't have to be too large; 1000-2000 samples is often enough.)
Data
Training set (examples labeled ok or defect; used to train the model and learn the A --> B mapping)
Test set (not used to train the model; used to measure the model's performance; should be representative of real-world data)
Don't expect 100% accuracy
Limitations of ML
Insufficient data
Mislabeled data
Ambiguous labels
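Checking an acceptance criterion against a held-out test set might look like the sketch below. Everything here is a stand-in: `predict_defect` is a hypothetical rule playing the role of a trained model, the "crack score" feature is invented, and the five test examples are made up (a real test set would be far larger).

```python
def accuracy(predict, test_set):
    """Fraction of test examples where the prediction matches the label."""
    correct = sum(1 for features, label in test_set if predict(features) == label)
    return correct / len(test_set)

# Hypothetical stand-in for a trained model: flags a mug as defective
# when a made-up "crack score" exceeds a threshold.
def predict_defect(crack_score):
    return "defect" if crack_score > 0.5 else "ok"

# Hypothetical held-out test set of (crack_score, true_label) pairs.
test_set = [(0.1, "ok"), (0.9, "defect"), (0.4, "ok"), (0.7, "defect"), (0.6, "ok")]

ACCEPTANCE_THRESHOLD = 0.95  # the "95% of cases" target
score = accuracy(predict_defect, test_set)
print(f"accuracy={score:.2f}, accepted={score >= ACCEPTANCE_THRESHOLD}")
# accuracy=0.80, accepted=False
```

The test examples are never shown to the model during training, which is what makes the measured accuracy a fair estimate.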
AI tools
ML Frameworks
PyTorch, TensorFlow, Hugging Face, PaddlePaddle, Scikit-learn, R.
Research publications
arXiv
Open source projects
GitHub
Building AI in your company
Smart speaker example
"Hey device, tell me a joke"
Trigger word detection (input audio, output has_trigger_word_spoken)
Speech recognition (input audio, output text)
Intent recognition (input text, output intent (joke? time? music? call? weather?))
Execute action (if it's a joke, get a joke from the database and return it as text)
AI Pipeline
flowchart LR
trigger_word_detection[Trigger word detection] --> speech_recognition[Speech recognition] --> intent_recognition[Intent recognition] --> execute_action[Execute action]
"Hey device, set timer for 10 minutes"
Trigger word detection (input audio, output has_trigger_word_spoken)
Speech recognition (input audio, output text)
Intent recognition (input text, output intent (set_timer?, timer_duration?))
Execute action (if the intent is set_timer, set the timer for timer_duration minutes)
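The four-stage smart speaker pipeline can be sketched as four chained functions. This is a toy under clear assumptions: the "audio" is represented as a text string, the first three stages are simple string rules standing in for trained models, and all function names are hypothetical.

```python
import re

def detect_trigger_word(audio):
    """Stage 1: audio -> has_trigger_word_spoken (stubbed on text)."""
    return audio.lower().startswith("hey device")

def recognize_speech(audio):
    """Stage 2: audio -> text (stub: strip the trigger phrase)."""
    return audio[len("hey device"):].strip(" ,")

def recognize_intent(text):
    """Stage 3: text -> (intent, slots)."""
    match = re.search(r"set timer for (\d+) minutes", text.lower())
    if match:
        return "set_timer", {"timer_duration": int(match.group(1))}
    if "joke" in text.lower():
        return "joke", {}
    return "unknown", {}

def execute_action(intent, slots):
    """Stage 4: run the action for the recognized intent."""
    if intent == "set_timer":
        return f"Timer set for {slots['timer_duration']} minutes"
    if intent == "joke":
        return "Here is a joke from the database"
    return "Sorry, I did not understand"

def handle(audio):
    """Run the stages in order; stop early if no trigger word is heard."""
    if not detect_trigger_word(audio):
        return None
    text = recognize_speech(audio)
    return execute_action(*recognize_intent(text))

print(handle("Hey device, set timer for 10 minutes"))  # Timer set for 10 minutes
```

Keeping each stage as a separate function mirrors how separate teams can own separate parts of the pipeline.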
Self-driving cars
Image / Radar / Lidar
Object detection (car detection, pedestrian detection, traffic sign detection, etc.)
Lane detection (detect the lanes on the road)
Outputs the position of the lanes.
Trajectory prediction (predict where the detected objects will be in the future)
Outputs the predicted position and speed of the detected objects.
Motion planning (how to move the car, based on the detected objects, without collisions)
Outputs the path and speed of the car.
The path should avoid obstacles and follow traffic rules.
Steer / Accelerate / Brake
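As a very rough illustration of the trajectory prediction step, the sketch below extrapolates an object's position from its last two observed positions. It assumes constant velocity, which is a simplification chosen for brevity and not how production systems necessarily work; `predict_position` is a hypothetical name.

```python
def predict_position(previous, current, steps_ahead=1):
    """Extrapolate an (x, y) position assuming constant velocity."""
    vx = current[0] - previous[0]   # change in x per time step
    vy = current[1] - previous[1]   # change in y per time step
    return (current[0] + vx * steps_ahead, current[1] + vy * steps_ahead)

print(predict_position(previous=(0, 0), current=(2, 1), steps_ahead=3))  # (8, 4)
```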
flowchart LR
image_radar_lidar[Sensor data]
subgraph object_detection
lane_detection[Lane detection]
car_detection[Car detection]
pedestrian_detection[Pedestrian detection]
traffic_light_detection[Traffic light detection]
obstacle_detection[Obstacle detection]
end
trajectory_prediction[Trajectory prediction]
image_radar_lidar --> object_detection --> trajectory_prediction --> motion_planning --> steer_accelerate_brake[Steer / Accelerate / Brake]
Roles and responsibilities in an AI team
Software Engineer
E.g., joke execution, timer execution, etc.
ML Engineer
Data gathering
Train a neural network
Test output
ML Researcher
Research new algorithms
Improve existing algorithms
Applied ML Scientist
Somewhere between ML engineer and ML researcher
Data Scientist
Examine data and provide insights
Create dashboards, reports, and presentations for the team and executives.
Data Engineer
Organize data
Make sure data is stored securely, cost-effectively, and in an easily accessible way.
AI product manager
Help decide what to build: what's feasible and valuable
You can start with a small team. You don't need a large team to start an AI project. Just you, an AI course, and a dataset are enough to start.
AI transformation playbook
Execute small pilot projects to gain momentum
Can be in-house or outsourced.
Show traction within 6-12 months.
It's more important for the first project to be successful than to be big.
Build an in-house AI team
CEO, CAIO (Chief AI Officer)
AI team (central AI team)
Business unit 1
Business unit 2
Business unit 3 (gift card)
The central AI team acts more like a consultancy that helps the business units implement AI in their projects.
It can help build company-wide data infrastructure and platforms.
It is better for the AI team to have separate funding than to rely on the business units for funding.
Provide broad AI training
Executives and business leaders (what AI can do for your enterprise, AI strategy, resource allocation)
Pod leaders (Project direction, resource allocation, monitoring progress)
AI engineers (100hrs of training, Build and ship AI software, gather data, execute on specific AI projects)
Develop an AI strategy
Leverage AI to create an advantage specific to your industry sector.
Virtuous cycle of AI (Better product --> More users --> More data --> Better product)
Consider creating a data strategy
Strategic data acquisition (offer free services to collect data, e.g. Gmail)
Unified data warehouses (Collect data from all business units, and store it in a central place)
Create network effects and platform advantages
In industries where "winner takes all" is common, like social media and search engines, AI can help accelerate network effects.
Develop internal and external communications
Investor relations
Government relations (regulations, compliance, etc.)
Customer / user education
Talent / recruitment
Internal communications
AI pitfalls to avoid
Don't expect AI to solve all your problems. Be realistic about what AI can do.
Don't just hire 3 or 4 ML engineers and expect them to solve all your problems. You need a team with diverse skill sets, including data scientists, data engineers, software engineers, etc.
Don't expect AI projects to be successful on the first try. AI projects are iterative; be prepared to fail and learn from your mistakes.
Traditional project planning doesn't work unchanged for AI projects. Work with your AI team to define the scope, timeline, and acceptance criteria for the project. AI KPIs are different from traditional software KPIs; define them based on the AI project.
You don't need a superstar AI engineer to start an AI project. You can start with a small team and online training.
Taking your first step in AI
Get friends to learn about AI
Start brainstorming projects
Hire a few ML/DS people to help
Hire or appoint an AI leader
Discuss with CEO/Board possibilities of AI transformation
Survey of major AI application areas
Supervised learning
Computer vision
Image classification (the whole image is named) / Object recognition (parts of the image are named)
Facial recognition
Object detection, (finds position of objects in an image, and classifies them, draws a box around the object)
Image segmentation, (is this pixel part of a face? or a car? or a tree?, draws precise boundaries of objects in an image)
Tracking (follows objects in a video, like a car, or a person)
Natural language processing (NLP)
Text classification (spam detection, sentiment analysis, etc.)
Information retrieval (Search engines, question answering)
Named entity recognition (NER) (extracts names, dates, locations, etc. from text)
Machine translation (translates text from one language to another)
Speech processing
A microphone records very rapid changes in air pressure.
Speech recognition (takes audio as input and outputs text)
Trigger word detection (detects if a specific word is spoken, like "Alexa", "Hey Google", etc.)
Speaker ID (listens to someone speak and identifies who it is)
Speech synthesis (text to speech, converts text to audio)
Generative AI
Creates high-quality content such as images, videos, text, audio, and music.
Input prompt, output content
Robotics
Perception (figures out what is in the environment, based on sensor input data)
Motion planning (figures out how to move the robot, based on the perception data)
Control (executes the motion plan, and moves the robot)
General machine learning
Unstructured data (images, audio, text, etc.)
Structured data (tabular data, like excel sheets, databases, etc.)
Unsupervised learning
Clustering
Price per packet vs No of packets sold
Detects purchase patterns in retail data
Groups similar items together, like customers, products, etc.
College kids purchase more energy drinks, and less coffee
Data points live in a high-dimensional space whose dimensions are features such as price, quantity, and location.
Relationships between data points are constructed automatically, without any labels.
Can come up with new insights, like "customers who buy energy drinks also buy chips", or "customers who buy coffee also buy pastries".
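A minimal clustering sketch, assuming invented (price per packet, packets sold) observations and using plain k-means as one common clustering method. The starting centroids are picked by hand here, and the features are left unscaled for simplicity, so packets sold dominates the distance.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical (price_per_packet, packets_sold) observations, with no labels.
points = [(1.0, 90), (1.2, 100), (0.9, 95), (5.0, 10), (5.5, 12), (4.8, 8)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(clusters)
```

No labels are supplied anywhere; the two groups (cheap, high-volume items and expensive, low-volume items) emerge from the data alone.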
Transfer learning
A model that is trained to detect cars with 100,000 images can be used to detect golf carts with 100 golf cart images.
Reinforcement learning
A drone learns to fly itself by trying different actions and getting feedback from the environment.
A pet dog learns to behave well by getting treats for good behavior and scolding for bad behavior.
Reinforcing good behavior and punishing bad behavior.
Uses a "reward signal" to tell when the AI is doing well or not.
Needs many iterations to learn the best actions. (The training process itself generates a lot of data.)
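The trial-and-error loop with a reward signal can be sketched as follows. This is a toy under stated assumptions: the "drone" has three invented actions with fixed rewards, and an epsilon-greedy strategy (one simple, widely used approach) stands in for a full reinforcement learning algorithm. The function name `train` is hypothetical.

```python
import random

def train(reward_for, actions, episodes=500, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error: try actions, keep a running average reward."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in actions}   # estimated reward per action
    count = {a: 0 for a in actions}     # how often each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)            # explore a random action
        else:
            action = max(actions, key=value.get)    # exploit the best estimate so far
        reward = reward_for(action)                 # the "reward signal"
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value

# Hypothetical drone environment: hovering is rewarded, tilting is penalized.
rewards = {"hover": 1.0, "tilt_left": -1.0, "tilt_right": -1.0}
learned = train(rewards.get, list(rewards))
print(max(learned, key=learned.get))  # hover
```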
Generative adversarial networks (GANs)
Synthetic data generation
AI-generated supermodels
Knowledge graph
A graph that represents knowledge in a structured way
Nodes represent entities, and edges represent relationships between entities
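A knowledge graph can be sketched as a list of (entity, relationship, entity) triples. The three facts and the helper name `related` are chosen only for illustration; real knowledge graphs hold far more entities and usually live in a dedicated store.

```python
# Example facts stored as (entity, relationship, entity) triples.
triples = [
    ("Ada Lovelace", "born_in", "London"),
    ("London", "capital_of", "United Kingdom"),
    ("Ada Lovelace", "field", "Mathematics"),
]

def related(entity, relationship):
    """Return every entity linked from `entity` by `relationship`."""
    return [obj for subj, rel, obj in triples if subj == entity and rel == relationship]

print(related("Ada Lovelace", "born_in"))  # ['London']
```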
AI and Society
AI and hype
We should be neither too optimistic nor too pessimistic about AI.
AI is a very powerful tool, but it has its limitations. We can mitigate its potential harms and use it to create tremendous value.
Limitations of AI
Explainability is hard (an AI sometimes needs to explain why it made a certain decision, and that is difficult)
Bias (if an AI is trained on biased data, it will produce biased results)
Susceptible to adversarial attacks
AI, developing economies and jobs
Bias
AI learning unhealthy stereotypes.
AI can be racist, sexist, and biased, from data.
For example, a model may associate programming with men more than with women because its training data contains more such associations.
If a face recognition system is trained on a dataset that has more images of white faces than black faces, it will perform better on white faces than black faces.
A bank's AI may suggest lower credit limits for black people than for white people, even if they have the same credit score.
A resume-screening AI may favor men over women if its training data is biased.
Reducing bias in AI systems is paramount.
Combating bias
Zero out the bias in the word representations. (Let's say "White programmer" has an association of 0.8 and "Black programmer" has 0.2; we can zero out the bias by making both associations equal to 0.5 in the data space.)
Use less biased data.
Use more inclusive data. (Make sure the data represents people of many races.)
Audit the AI to figure out whether it is biased.
Diverse workforce: a more inclusive workforce can help reduce bias in AI systems.
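One common way to "zero out" a bias in word representations is to remove the part of a word's vector that points along a bias direction. The sketch below shows that projection on made-up 2-D vectors. It is a swapped-in formulation for illustration, not necessarily the exact procedure the notes have in mind, and the vectors, the bias direction, and the name `neutralize` are all assumptions.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def neutralize(word_vec, bias_direction):
    """Remove the component of word_vec that lies along bias_direction."""
    scale = dot(word_vec, bias_direction) / dot(bias_direction, bias_direction)
    return [w - scale * b for w, b in zip(word_vec, bias_direction)]

# Made-up 2-D embeddings; axis 0 stands in for a learned bias direction.
bias_direction = [1.0, 0.0]
programmer = [0.6, 0.8]   # leans toward one group along axis 0

debiased = neutralize(programmer, bias_direction)
print(debiased, dot(debiased, bias_direction))
```

After the projection, the word vector has no component along the bias direction, so it is equally close to both groups along that axis.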
Adversarial attacks on AI
AI can be tricked into revealing sensitive information.
AI can classify a hummingbird as a hammer after minor perturbations (changes) are made to the image.
Physical attacks, like putting a specific sticker on a stop sign, can make the AI think it's a speed limit sign.
Putting on a certain type of glasses can make the AI think the wearer is a different person.
AI can be fooled into misclassifying images by adding noise to the image.
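How a small perturbation can flip a prediction is easiest to see on a toy linear classifier. The sketch below nudges every pixel by at most 0.1 in the direction that lowers the score, similar in spirit to sign-based gradient attacks. The weights, the four-pixel "image", and the function names are invented for illustration; real image classifiers are far larger.

```python
def classify(pixels, weights, bias):
    """Toy linear classifier: a positive score means "hummingbird"."""
    score = sum(p * w for p, w in zip(pixels, weights)) + bias
    return "hummingbird" if score > 0 else "hammer"

def perturb(pixels, weights, epsilon):
    """Nudge every pixel by at most epsilon in the direction that lowers the score."""
    return [p - epsilon * (1 if w > 0 else -1) for p, w in zip(pixels, weights)]

# Made-up weights and image; real models have millions of parameters.
weights = [0.5, -0.4, 0.3, -0.2]
bias = 0.0
image = [0.5, 0.5, 0.5, 0.5]

adversarial = perturb(image, weights, epsilon=0.1)
print(classify(image, weights, bias), "->", classify(adversarial, weights, bias))
# hummingbird -> hammer
```

Each pixel changes only slightly, but because every change pushes the score the same way, the small shifts add up and cross the decision boundary.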
Defenses
Ongoing research.
As with spam vs. anti-spam, we may be in an arms race for some applications.
AI generated video detector, to detect if a video is real or fake.
Adverse uses of AI
DeepFakes
Oppressive surveillance
Fake reviews / Fake comments (political bots)
Spam vs. anti-spam; fraud vs. anti-fraud
AI and developing economies
There will be fewer opportunities for low-skilled workers.
AI will automate away certain jobs, like data entry, customer support, etc.
AI and jobs
There is a lot of uncertainty about how AI will impact jobs.
Some estimates suggest AI will create more jobs than it displaces.
Superwised learnings
email --> is_spam (spam filtering)
audio --> text transcription (speech recognition)
english --> french (machine translation)
visual inspection --> defect detection (quality control)
sequence of words --> next word prediction (chatbot)
LLM's
repeated predicts the next word
input, output.
my favorite drink, is.
my favorite drink is coffee.
my favorite drink is coffee, and I like it because it is.
output is sanitised, to be not offensive, or harmful.
Neural networks have improved the quality of LLMs, as you train them on more data, they get better.
What is data? (dataset)
No title
Data is unique to your business.
What is input and what is output?
size_of_house + no_of_bedrooms + location --> price_of_house
How do acquire data?
Manual labeling
Observing user behaviors user_id, time, price, has_purchased?
Observing machine behaviors machine_id, time, temperature, pressure, has_failed? Can tell if the machine is going to fail, based on the data.
Download it from websites / partnerships, (keep in mind licensing and copyright) Kaggle, UCI Machine Learning Repository, etc.
All data is not created equal, some data is more valuable than others.
More data doesn't always mean better results, quality of data is more important than quantity.
Data can have missing values, outliers, and noise. ( we need to clean up data before processing it )
Some time of data like images, audio, text, are unstructured data, which means they don't have a predefined format.
Techniques to deal with unstructured data is different from structured data.
The terminology of AI
ML vs data science
ML algos that uses data-sets, to get output from input. size_of_house + no_of_bedrooms + location --> price_of_house Field of study that gives computers the ability to learn without being explicitly programmed. Results in a software model that can execute to generate outputs for a certain input.
Data science team analyses data sets to find insights. "Did you know newly renovated houses sell for 20% more than non-renovated houses?" Science of extracting knowledge and insights from data. Side decks / Pitch decks for investors, to show them the insights.
Deep learning
Artificial neural networks.
Nodes and connections, similar to how the human brain works.
Its completely unrelated to how actual human brain works.
Takes input and outputs a prediction.
[[ Deep learning ], Machine learning, other tools], AI, other tools like knowledge graphs, rule-based systems, etc.]
What makes an AI company?
What makes a good internet company?
Shopping mall + Website != Internet company
A/B testing
Short iteration cycles
Data-driven decision making (engineers and product managers make decisions based on data, not executive opinions)
What makes a good AI company?
Strategic data acquisition
Unified data platform
Using AI to automate repetitive tasks
New roles and responsibilities (ML engineers, data scientists, etc.)
What ML can and cannot do?
spam? (0/1)
spam filtering
audio
text transcripts
speech recognition
English
Chinese
machine translation
ad, user info
click? (0/1)
online advertising
image, radar info
position of other cars
Self-driving car
image of phone
defect? (0/1)
visual inspection
sequence of words
the next word
chatbot
Imperfect rule of thumb: If a human can do it less than one sec of thought, then ML can do it too.
Feasible
Learning simple concepts
Having lots of data
eg:
Self driving cars, can guess where the other cars it saw are, based on their previous positions and what in front of them.
Not feasible
Complex concepts
With less sample data
eg:
If AI is trained on a certain type of data, it will not work on other types of data. (eg, if an AI is trained on lateral chest xrays to detect pneumonia, it will not work on frontal chest x-rays, or x-rays not aligned properly)
Human gestures interpretation, like roadworker stopping a car, is not feasible for AI to learn, as it requires complex understanding of human behavior and context, with large amount of data. (Even we struggle to understand it sometimes)
Non technical explanation of deep learning
flowchart LR
price[Price] --> n1((N1))
shipping_cost[Shipping Cost] --> n1
n1 --> |affordability| d1((Demand))
marketing[Marketing] --> n2((N2))
material[Material] --> n3((N3))
n2 --> | awareness | d1
n3 --> | perceived quality | d1
marketing --> n3
price --> n3The above neural network is a simple example of how deep learning works. It takes multiple inputs (price, shipping cost, marketing, material) and processes them through nodes (N1, N2, N3) to produce an output (demand).
It figures out the relationships on its self.
Feed lots of input for (price, shipping cost, marketing, material, demand) and it will figure out the relationships on its own.
For eg in a facedetection model, it will figure out the relationships between pixels, edges, and shapes to identify parts of the face, like eyes, nose, mouth, etc. And then combine them to identify the face as a whole.
Staring point of an AI project
What is the workflow of a machine learning project?
Key steps of a machine learning project
Alexa
Collect data of people saying (Alexa or other trigger words)
Train the model to recognize the trigger words (Iterate many times until its good enough)
Deploy model (Get data back, maintain and update model) (may or may not be possible, depending on the privacy and security policies of the company)
Self-driving car
Collect data of car positions (Red squares on pictures)
Train the model to predict the position of other cars
Deploy model
What is the workflow of a data science project?
Optimizing a sales funnel
Collect data of user behavior on the website (visits, clicks, time spent, etc.)
Analyze the data to find insights "Overseas users leave when they find high shipping costs" "Spend fewer marketing dollars on overseas users"
Suggest hypotheses to improve the sales funnel "Reduce shipping costs for overseas users" Re-analyze the data to see if the changes had an impact
Optimizing the manufacturing line
Mix clay, shape mug, add glaze, fire kiln, final inspection
Collect data of the manufacturing process (temperature, pressure, time, etc.)
Analyze the data to find insights "High temperature leads to more defects" "Reduce temperature to reduce defects" "Because ambient temperature is warmer in the afternoon, we need to reduce the temperature in the afternoon"
Suggest hypotheses to improve the manufacturing process, or yield "Reduce temperature to reduce defects" Re-analyze the data to see if the changes had an impact
Every job function needs to learn how to use data
Data science can optimise the sales funnel, machine learning, can automate lead sorting.
Data science can help optimize the manufacturing process, machine learning can automate quality control.
Data science can help optimize the recruiting funnel, machine learning can automate resume screening. (Your system should be ethical and not biased)
Data science can help optimise the user-experience of a website, machine learning can automate content recommendations, can suggest push notifications, etc.
Data science can help suggest what to plant when and where, machine learning can detect where weeds are and help automate weeding.
How to choose an AI project?
No title
What AI can do (AI experts)
Valuable for your business (domain experts)
Select a project that's overlapping
Framework for brainstorming AI projects
Thinking about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (difficult for new players to come in)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do?, Whats? valuable for your business?
Should overlap b/w the above steps.
Technical diligence
Can a AI system meet desired performance? (eg, 95% accuracy)
How much data is needed to achieve that performance?
Engineering timeline
Business diligence
Lowering costs
Increases revenue
Launch a new product or service
Ethical diligence
Does it make the society better?
Build vs Buy?
ML projects can be in-house or outsourced.
DS projects are more commonly in-house. (Its so closely tied to your business, it makes sense to keep it in-house)
Some things will be industry standard, don't reinvent the wheel. (Don't try to outrun a train)
Working with an AI team
Specify an acceptance criteria
Goal: detect defects in coffee mugs, in 95% of cases. (statistically, avg)
Provide AI team with a dataset to measure performance. (test set, doesn't have to be tool large, 1000-2000 samples is enough)
Data
Training set (ok, defect, used to train the model, and create A -> B mapping)
Test set (will not be used to train the model, used to measure performance of the model, should be representative of the real world data)
Don't expect 100% accuracy
Limitation of ML
Insufficient data
Mislabeled labels
Ambiguous labels
AI tools
ML Frameworks
PyTorch, TensorFlow, Hugging Farce, PaddlePaddle, Scikit-learn, R.
Reasearch Publications
Arxiv
Open source projects
Github
Building AI in your company
Smart speaker example
"Hey device, tell me a joke"
Trigger word detection, (input audio, output has_trigger_word_spoken)
Speech recognition, (input audio, output text)
Intent recognition, (input text, output intent (joke? time? music? call? weather?))
Execute action, (If its a joke, then get a joke from the database, and return it as text)
AI Pipeline
flowchart LR
trigger_word_detection[Trigger word detection] --> speech_recognition[Speech recognition] --> intent_recognition[Intent recognition] --> execute_action[Execute action]"Hey device, set timer for 10 minutes"
Trigger word detection, (input audio, output has_trigger_word_spoken)
Speech recognition, (input audio, output text)
Intent recognition, (input text, output intent (set_timer?, timer_duration?))
Execute action, (If its a set_timer, then set the timer for timer_duration minutes)
Self driving cars
Image / Radar / Lidar
Object detection, (car detection, pedestrian detection, traffic sign detection, etc.)
Lane detection, (detect the lanes on the road)
Outputs the position of the lanes.
Trajectory prediction, (predict where the detected objects will be in the future)
Outputs the predicted position and speed of the detected objects.
Motion planning, (how to move the car, based on the detected objects, without collisions)
Outputs the path and speed of the car.
Path should avoid obstacles, and follow traffic rules.
Steer / Accelerate / Brake
flowchart LR
image_radar_lidar[Sensor data]
subgraph object_detection
lane_detection[Lane detection]
car_detection[Car detection]
pedestrian_detection[Pedestrian detection]
traffic_light_detection[Traffic light detection]
obstacle_detection[Obstacle detection]
end
trajectory_prediction[Trajectory prediction]
image_radar_lidar --> object_detection --> trajectory_prediction --> motion_planning --> steer_accelerate_brake[Steer / Accelerate / Brake]Roles and responsibilities in an AI team
Software Engineer
E.g., joke execution, timer execution, etc.
ML Engineer
Data gathering
Train a neural network
Test output
ML Researcher
Research new algorithms
Improve existing algorithms
Applied Learning Scientist
Somewhere b/w ML engineer and ML researcher
Data Scientist
Examine data and provide insights
Create dashboards, reports and presentations to team/executives.
Data Engineer
Organize data
Make sure data is saved securely, easily accessible and cost effectively.
AI product manager
Help decide what to build; whats feasible and valuable
You can start with a small team. You don't need a large team to start an AI project. Just you with a AI course and a dataset is enough to start.
AI transformation playbook
Execute small pilot projects to gain momentum
Can be in house or outsourced.
Show traction within 6/12 months.
Its more important for the first project to be successful, than to be big.
Build an in-house AI team
CEO, CAIO (Chief AI Officer)
AI team (central AI team)
Business unit 1
Business unit 2
Business unit 3 (gift card)
The central AI team acts more like a consultancy that helps the business units implement AI in their projects.
It can help build company-wide data infrastructure and platforms.
It is better for the AI team to have separate funding rather than relying on the business units for funding.
Provide broad AI training
Executives and business leaders (what AI can do for your enterprise, AI strategy, resource allocation)
Pod leaders (project direction, resource allocation, monitoring progress)
AI engineers (100 hrs of training; build and ship AI software, gather data, execute on specific AI projects)
Develop an AI strategy
Leverage AI to create an advantage specific to your industry sector.
Virtuous cycle of AI (Better product --> More users --> More data --> Better product)
Consider creating a data strategy
Strategic data acquisition (offer free services to collect data, e.g., Gmail)
Unified data warehouses (collect data from all business units and store it in a central place)
Create network effects and platform advantages
In industries where "winner takes all" is common, like social media and search engines, AI can help accelerate the network effects.
Develop internal and external communications
Investor relations
Government relations (regulations, compliance, etc.)
Customer / user education
Talent / recruitment
Internal communications
AI pitfalls to avoid
Don't expect AI to solve all your problems. Be realistic about what AI can do.
Don't just hire three or four ML engineers and expect them to solve all your problems. You need a team with diverse skill sets, including data scientists, data engineers, and software engineers.
Don't expect AI projects to succeed on the first try. AI projects are iterative, so be prepared to fail and learn from your mistakes.
Traditional project planning doesn't carry over unchanged to AI projects. Work with your AI team to define the scope, timeline, and acceptance criteria for the project. AI KPIs are different from traditional software KPIs, so define them per project.
You don't need a superstar AI engineer to start an AI project. You can start with a small team and online training.
Taking your first step in AI
Get friends to learn about AI
Start brainstorming projects
Hire a few ML/DS people to help
Hire or appoint an AI leader
Discuss with CEO/Board possibilities of AI transformation
Survey of major AI application areas
Supervised learning
Computer vision
Image classification (the whole image gets a label) / Object recognition (parts of the image get labels)
Facial recognition
Object detection (finds the position of objects in an image, classifies them, and draws a box around each one)
Image segmentation (is this pixel part of a face, a car, or a tree? Draws precise boundaries around objects in an image)
Tracking (follows objects in a video, like a car, or a person)
Natural language processing (NLP)
Text classification (spam detection, sentiment analysis, etc.)
Information retrieval (Search engines, question answering)
Named entity recognition (NER) (extracts names, dates, locations, etc. from text)
Machine translation (translates text from one language to another)
Speech processing
A microphone records very rapid changes in air pressure
Speech recognition (takes audio as input and outputs text)
Trigger word detection (detects if a specific word is spoken, like "Alexa", "Hey Google", etc.)
Speaker ID (listens to someone speak and identifies who it is)
Speech synthesis (text to speech, converts text to audio)
Generative AI
Creates high-quality content, like images, text, audio, etc.
Input prompt, output content
Can create images, videos, text, audio, music, etc.
Robotics
Perception (figures out what is in the environment, based on sensor input data)
Motion planning (figures out how to move the robot, based on the perception data)
Control (executes the motion plan, and moves the robot)
General Machine learning
Unstructured data (images, audio, text, etc.)
Structured data (tabular data, like excel sheets, databases, etc.)
Unsupervised learning
Clustering
Price per packet vs. number of packets sold
Detects purchase patterns in retail data
Groups similar items together, like customers, products, etc.
E.g., college kids purchase more energy drinks and less coffee
Each data point is represented in a high-dimensional space, with dimensions like price, quantity, location, etc.
Relationships between data points are constructed automatically, without any labels.
Can come up with new insights, like "customers who buy energy drinks also buy chips", or "customers who buy coffee also buy pastries".
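As a rough illustration of grouping without labels, here is a minimal k-means sketch in plain Python. The data points (price per packet, packets sold) are made up, and starting the centroids at the first and last point is an arbitrary choice to keep the example deterministic; k-means is one common clustering algorithm, not necessarily the one any particular product uses.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        centroids = new_centroids
    return labels, centroids


# Made-up data: (price per packet, packets sold per customer).
points = [(1.0, 9.0), (1.5, 8.0), (1.2, 10.0), (5.0, 2.0), (5.5, 1.0), (6.0, 2.5)]
labels, centroids = kmeans(points, centroids=[points[0], points[-1]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No labels were supplied; the two groups (cheap and frequently bought vs. expensive and rarely bought) emerge from the distances alone.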
Transfer learning
A model that is trained to detect cars with 100,000 images can be used to detect golf carts with 100 golf cart images.
Reinforcement learning
A drone learns to fly itself by trying different actions and getting feedback from the environment.
A pet dog learns to behave well by getting treats for good behavior and scolding for bad behavior.
Reinforces good behavior and punishes bad behavior.
Uses a "reward signal" to tell the AI when it is doing well or poorly.
Needs to iterate many times to learn the best actions. (The training runs themselves generate a lot of data.)
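The reward-signal idea can be sketched with a tiny Q-learning style update on a made-up corridor world. Everything here is an assumption for illustration: the five-position corridor, the reward of 1.0 at the goal, the discount factor, and the learning rate of 1. To stay deterministic, the loop sweeps every state and action in order instead of exploring by trial and error as a real agent would.

```python
N_STATES = 5          # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
GAMMA = 0.9           # discount factor: later rewards count slightly less


def step(state, action):
    """Environment: returns (next_state, reward). Reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


# q[state][action_index] estimates how good an action is in a state.
q = [[0.0, 0.0] for _ in range(N_STATES)]

for _ in range(20):                      # iterate many times
    for state in range(N_STATES - 1):    # the goal state is terminal
        for i, action in enumerate(ACTIONS):
            next_state, reward = step(state, action)
            future = 0.0 if next_state == N_STATES - 1 else max(q[next_state])
            q[state][i] = reward + GAMMA * future   # learning rate of 1 for simplicity

policy = ["left" if q[s][0] > q[s][1] else "right" for s in range(N_STATES - 1)]
print(policy)  # ['right', 'right', 'right', 'right']
```

The agent is never told that "right" is correct; it only sees the reward, and the preference for moving right spreads backward from the goal over repeated sweeps.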
Generative adversarial networks (GANs)
Synthetic data generation
AI supermodel generation
Knowledge graph
A graph that represents knowledge in a structured way
Nodes represent entities, and edges represent relationships between entities
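A minimal sketch of the nodes-and-edges idea, storing each edge as a (subject, relation, object) triple. The relation names and the helper functions objects and subjects are hypothetical; real knowledge graphs use dedicated stores and query languages.

```python
# Each edge is a (subject, relation, object) triple; the nodes are the entities.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "located_in", "Europe"),
]


def objects(subject, relation):
    """Follow one kind of edge out of a node."""
    return [o for s, r, o in triples if s == subject and r == relation]


def subjects(relation, obj):
    """Follow one kind of edge into a node."""
    return [s for s, r, o in triples if r == relation and o == obj]


print(objects("Paris", "capital_of"))     # ['France']
print(subjects("located_in", "Europe"))   # ['France', 'Germany']
```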
AI and Society
AI and hype
We should be neither too optimistic nor too pessimistic about AI.
AI is a very powerful tool, but it has its limitations. We can mitigate its potential harms and use it to create tremendous value.
Limitations of AI
Explainability is hard (it is difficult for an AI to explain why it made a certain decision)
Bias (if an AI is trained on biased data, it will produce biased results)
Susceptible to adversarial attacks
AI, developing economies and jobs
Bias
AI can learn unhealthy stereotypes.
AI can pick up racist, sexist, and other biases from its training data.
For example, a model may associate programming with men more than with women, because its training text contains more such associations.
If a face recognition system is trained on a dataset that has more images of white faces than Black faces, it will perform better on white faces than on Black faces.
A bank's model may suggest lower credit limits for Black people than for white people, even if they have the same credit score.
A resume-screening AI may favor men over women if its training data is biased.
Reducing bias in AI systems is paramount.
Combating bias
Zero out the bias in the word representations (say "White programmer" has an association of 0.8 and "Black programmer" 0.2; we can remove the bias by making both associations equal to 0.5 in the data space)
Use less biased data.
Use more inclusive data. (Make sure most races are represented in the data.)
Audit the system to figure out if the AI is biased.
Diverse workforce. A more inclusive workforce can help spot and reduce bias in AI systems.
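One way to read "zero out the bias" is to remove the part of a word's vector that lies along a bias direction. The sketch below assumes that interpretation; the 3-dimensional vectors and the bias axis are made up, and the function name neutralize is hypothetical. Real systems first have to estimate the bias direction from data.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def neutralize(word_vec, bias_direction):
    """Remove the part of word_vec that lies along bias_direction."""
    scale = dot(word_vec, bias_direction) / dot(bias_direction, bias_direction)
    return [w - scale * b for w, b in zip(word_vec, bias_direction)]


# Made-up 3-dimensional embeddings, for illustration only.
bias_direction = [1.0, 0.0, 0.0]   # hypothetical axis that encodes the bias
programmer = [0.6, 0.3, 0.5]       # leans toward one end of that axis

debiased = neutralize(programmer, bias_direction)
print(debiased)  # [0.0, 0.3, 0.5]: nothing left along the bias axis
```

After this step the word sits exactly in the middle of the bias axis, which matches the intent of making both associations equal.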
Adversarial attacks on AI
AI can be tricked into revealing sensitive information.
AI can classify a hummingbird as a hammer after minor perturbations (changes) to the image.
Physical attacks, like putting a specific sticker on a stop sign, can make the AI think it's a speed limit sign.
Wearing a certain type of glasses can make the AI think you are a different person.
AI can be fooled into misclassifying images by adding noise to the image.
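To see why a small perturbation can flip a prediction, consider a toy linear classifier. The weights and the four-value "image" are made up, and the attack simply nudges every input value a little in the direction that lowers the score, in the spirit of gradient-sign attacks; real attacks target deep networks and real images.

```python
def score(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def classify(weights, x):
    return "hummingbird" if score(weights, x) > 0 else "hammer"


def perturb(weights, x, epsilon):
    """Shift each input value by epsilon in the direction that lowers the score."""
    return [xi - epsilon * (1 if w > 0 else -1) for w, xi in zip(weights, x)]


weights = [0.5, -0.25, 0.75, -0.5]   # hypothetical trained classifier
image = [0.5, 0.5, 0.25, 0.25]       # hypothetical 4-"pixel" image

adversarial = perturb(weights, image, epsilon=0.125)
print(classify(weights, image), "->", classify(weights, adversarial))
# hummingbird -> hammer
```

No single value changes by more than 0.125, yet the many small changes add up in the score and push it across the decision boundary.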
Defenses
Ongoing research.
As with spam vs. anti-spam, we may be in an arms race for some applications.
AI-generated video detectors, to tell whether a video is real or fake.
Adverse uses of AI
DeepFakes
Oppressive surveillance
Fake reviews / Fake comments (political bots)
Spam vs. anti-spam; fraud vs. anti-fraud
AI and developing economies
There will be fewer opportunities for low-skilled workers.
AI will automate away certain jobs, like data entry, customer support, etc.
AI and jobs
There is a lot of uncertainty about how AI will impact jobs.
AI is expected to create more jobs than it displaces.
The terminology of AI
ML vs data science
ML: algorithms that use datasets to learn how to get an output from an input, e.g. size_of_house + no_of_bedrooms + location --> price_of_house. Field of study that gives computers the ability to learn without being explicitly programmed. Results in a software model that can be run to generate an output for a given input.
Data science: a team analyzes datasets to find insights, e.g. "Did you know newly renovated houses sell for 20% more than non-renovated houses?" Science of extracting knowledge and insights from data. Results in slide decks / pitch decks that present the insights, e.g. to investors.
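To make the ML side concrete, here is a minimal sketch of learning an input-to-output mapping from examples. It uses only house size as the input, and the four training rows are made up; a least-squares line is just one simple way to learn such a mapping, chosen here for brevity.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: house size in square feet -> price in thousands.
sizes = [1000, 1500, 2000, 2500]
prices = [200, 300, 400, 500]

slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """The trained model: maps a new input to an output."""
    return slope * size + intercept


print(predict_price(1800))  # roughly 360 for this made-up data
```

The output of the ML work is the predict_price function itself, which can be run on new inputs; a data science project on the same data would instead end in a finding for people to act on.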
Deep learning
Artificial neural networks.
Nodes and connections, loosely inspired by the human brain.
The details are almost completely unrelated to how the actual human brain works.
Takes an input and outputs a prediction.
Deep learning is a subset of machine learning, which is a subset of AI. AI also includes other tools such as knowledge graphs and rule-based systems.
What makes an AI company?
What makes a good internet company?
Shopping mall + Website != Internet company
A/B testing
Short iteration cycles
Data-driven decision making (engineers and product managers make decisions based on data, not executive opinions)
What makes a good AI company?
Strategic data acquisition
Unified data platform
Using AI to automate repetitive tasks
New roles and responsibilities (ML engineers, data scientists, etc.)
What ML can and cannot do?
email --> spam? (0/1) (spam filtering)
audio --> text transcripts (speech recognition)
English --> Chinese (machine translation)
ad, user info --> click? (0/1) (online advertising)
image, radar info --> position of other cars (self-driving car)
image of phone --> defect? (0/1) (visual inspection)
sequence of words --> the next word (chatbot)
Imperfect rule of thumb: if a human can do a task with less than one second of thought, ML can probably do it too.
Feasible
Learning simple concepts
Having lots of data
E.g.:
A self-driving car can estimate where the other cars it sees are, based on their previous positions and what is in front of them.
Not feasible
Complex concepts
Little sample data
E.g.:
If an AI is trained on a certain type of data, it may not work on other types of data. (E.g., an AI trained on lateral chest X-rays to detect pneumonia may not work on frontal chest X-rays, or on X-rays that are not aligned properly.)
Interpreting human gestures, like a road worker signaling a car to stop, is hard for AI to learn: it requires a complex understanding of human behavior and context, and a large amount of data. (Even we struggle to understand it sometimes.)
Non technical explanation of deep learning
[Diagram: neural network with inputs price, shipping cost, marketing, material; nodes N1, N2, N3; output demand]
The neural network in the diagram is a simple example of how deep learning works. It takes multiple inputs (price, shipping cost, marketing, material) and processes them through nodes (N1, N2, N3) to produce an output (demand).
It figures out the relationships on its own.
Feed it lots of examples of (price, shipping cost, marketing, material, demand) and it will learn how the inputs relate to demand.
E.g., in a face detection model, it will figure out the relationships between pixels, edges, and shapes to identify parts of the face, like eyes, nose, mouth, etc., and then combine them to identify the face as a whole.
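A minimal sketch of how such a network turns the four inputs into a demand estimate. All numbers are made up: in practice the weights are not written by hand but adjusted automatically during training, and the use of a ReLU activation is an assumption made here for simplicity.

```python
def relu(x):
    return max(0.0, x)


def forward(inputs, hidden_weights, output_weights):
    """One hidden layer: each node mixes the inputs, then the output mixes the nodes."""
    hidden = [relu(sum(w * x for w, x in zip(node, inputs))) for node in hidden_weights]
    return sum(w * h for w, h in zip(output_weights, hidden)), hidden


# Inputs in the order: price, shipping cost, marketing, material (made-up, scaled values).
inputs = [0.5, 0.25, 1.0, 0.5]

# Made-up weights; training would find these numbers from data.
hidden_weights = [
    [1.0, 1.0, 0.0, 0.0],   # N1: reacts to total cost (price + shipping)
    [0.0, 0.0, 1.0, 0.0],   # N2: reacts to marketing
    [0.5, 0.0, 0.0, 1.0],   # N3: reacts to price and material together
]
output_weights = [-1.0, 2.0, 1.0]   # higher cost lowers demand, marketing raises it

demand, hidden = forward(inputs, hidden_weights, output_weights)
print(hidden, demand)  # [0.75, 1.0, 0.75] 2.0
```

The comments on N1 to N3 are interpretations added for readability; a trained network decides for itself what each node ends up representing.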
Starting point of an AI project
What is the workflow of a machine learning project?
Key steps of a machine learning project
Alexa
Collect data (audio of people saying "Alexa" or other words)
Train the model to recognize the trigger word (iterate many times until it's good enough)
Deploy the model (get data back, maintain and update the model; getting data back may or may not be possible, depending on the company's privacy and security policies)
Self-driving car
Collect data (images with the positions of other cars marked as red boxes)
Train the model to predict the position of other cars
Deploy model
What is the workflow of a data science project?
Optimizing a sales funnel
Collect data of user behavior on the website (visits, clicks, time spent, etc.)
Analyze the data to find insights, e.g. "Overseas users leave when they see high shipping costs" or "Spend fewer marketing dollars on overseas users".
Suggest hypotheses/actions to improve the sales funnel, e.g. "Reduce shipping costs for overseas users". Re-analyze the data to see if the changes had an impact.
Optimizing the manufacturing line
Steps: mix clay, shape mug, add glaze, fire kiln, final inspection
Collect data on the manufacturing process (temperature, pressure, time, etc.)
Analyze the data to find insights, e.g. "High temperature leads to more defects" or "Because ambient temperature is warmer in the afternoon, we need to reduce the temperature in the afternoon".
Suggest hypotheses/actions to improve the manufacturing process or yield, e.g. "Reduce temperature to reduce defects". Re-analyze the data to see if the changes had an impact.
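The "analyze the data to find insights" step can be sketched for the sales funnel example. The visit log below is made up, and splitting by region is just one of many cuts an analyst might try.

```python
from collections import defaultdict

# Made-up visit log: (region, reached_checkout, purchased).
visits = [
    ("domestic", True, True),
    ("domestic", True, True),
    ("domestic", True, False),
    ("domestic", False, False),
    ("overseas", True, False),
    ("overseas", True, False),
    ("overseas", True, True),
    ("overseas", True, False),
]

checkouts = defaultdict(int)
purchases = defaultdict(int)
for region, reached_checkout, purchased in visits:
    if reached_checkout:
        checkouts[region] += 1
        purchases[region] += purchased   # True counts as 1

# Share of checkout visits that end in a purchase, per region.
conversion = {region: purchases[region] / checkouts[region] for region in checkouts}
print(conversion)
```

In this made-up log, overseas visitors who reach checkout buy far less often than domestic ones, which is the kind of observation that leads to a hypothesis such as "shipping costs are putting overseas users off".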
Every job function needs to learn how to use data
Data science can help optimize the sales funnel; machine learning can automate lead sorting.
Data science can help optimize the manufacturing process; machine learning can automate quality control.
Data science can help optimize the recruiting funnel; machine learning can automate resume screening. (Your system should be ethical and unbiased.)
Data science can help optimize the user experience of a website; machine learning can automate content recommendations, suggest push notifications, etc.
Data science can help suggest what to plant, when, and where; machine learning can detect where weeds are and help automate weeding.
How to choose an AI project?
What AI can do (AI experts)
Valuable for your business (domain experts)
Select a project in the overlap of the two
Framework for brainstorming AI projects
Think about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (difficult for new players to come in)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do? What's valuable for your business?
The project should sit in the overlap between the two.
Technical diligence
Can an AI system meet the desired performance? (e.g., 95% accuracy)
How much data is needed to achieve that performance?
Engineering timeline
Business diligence
Lowering costs
Increasing revenue
Launching a new product or service
Ethical diligence
Does it make society better?
Build vs Buy?
ML projects can be in-house or outsourced.
DS projects are more commonly in-house. (They are so closely tied to your business that it makes sense to keep them in-house.)
Some things will be industry standard; don't reinvent the wheel. (Don't try to outrun a train.)
Working with an AI team
Specify acceptance criteria
Goal: detect defects in coffee mugs with 95% accuracy. (a statistical, average measure)
Provide the AI team with a dataset on which to measure performance. (A test set; it doesn't have to be too large, 1000-2000 samples is often enough.)
Data
Training set (examples labeled ok or defect; used to train the model and create the A -> B mapping)
Test set (not used to train the model; used to measure the model's performance; should be representative of real-world data)
Don't expect 100% accuracy
Limitations of ML
Insufficient data
Mislabeled data
Ambiguous labels
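Checking a model against an acceptance criterion on a held-out test set can be sketched as follows. The 20-mug test set and the model's predictions are made up, and the function names accuracy and meets_acceptance are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of test examples where the prediction matches the label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def meets_acceptance(predictions, labels, required=0.95):
    return accuracy(predictions, labels) >= required


# Made-up test set of 20 mugs: labels from human inspectors, predictions from the model.
labels = ["ok"] * 16 + ["defect"] * 4
predictions = ["ok"] * 16 + ["defect"] * 3 + ["ok"]   # one defect missed

print(accuracy(predictions, labels))          # 0.95
print(meets_acceptance(predictions, labels))  # True
```

Note that the one error here is a missed defect; an overall accuracy figure hides which kind of mistake was made, so teams often agree on additional measures alongside it.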
AI tools
ML Frameworks
PyTorch, TensorFlow, Hugging Face, PaddlePaddle, Scikit-learn, R.
Research publications
arXiv
Open source projects
GitHub
Building AI in your company
Smart speaker example
"Hey device, tell me a joke"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent (joke? time? music? call? weather?))
Execute action (if the intent is joke, get a joke from the database and return it as text)
AI Pipeline
flowchart LR
trigger_word_detection[Trigger word detection] --> speech_recognition[Speech recognition] --> intent_recognition[Intent recognition] --> execute_action[Execute action]
"Hey device, set timer for 10 minutes"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent (set_timer?, timer_duration?))
Execute action (if the intent is set_timer, set the timer for timer_duration minutes)
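The four pipeline steps can be sketched as four functions chained together. This is a toy stand-in, not how a real smart speaker works: the "audio" is already a text string, each step is a simple string rule where a real system would use a trained model, and all function names are hypothetical.

```python
def detect_trigger_word(audio: str) -> bool:
    # Stand-in: the "audio" here is already text, so just look for the trigger phrase.
    return audio.lower().startswith("hey device")


def recognize_speech(audio: str) -> str:
    # Stand-in for speech recognition: strip the trigger phrase and punctuation.
    return audio[len("hey device"):].strip(" ,.")


def recognize_intent(text: str) -> dict:
    words = text.lower().split()
    if "joke" in words:
        return {"intent": "joke"}
    if "timer" in words:
        minutes = next((int(w) for w in words if w.isdigit()), None)
        return {"intent": "set_timer", "timer_duration": minutes}
    return {"intent": "unknown"}


def execute_action(intent: dict) -> str:
    if intent["intent"] == "joke":
        return "Here is a joke from the database."
    if intent["intent"] == "set_timer":
        return f"Timer set for {intent['timer_duration']} minutes."
    return "Sorry, I did not understand."


def handle(audio: str) -> str:
    if not detect_trigger_word(audio):
        return ""   # stay silent without the trigger word
    return execute_action(recognize_intent(recognize_speech(audio)))


print(handle("Hey device, set timer for 10 minutes"))  # Timer set for 10 minutes.
```

Because each step only depends on the previous step's output, different teams can build and improve the steps separately, which is the main reason for structuring the product as a pipeline.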
Self driving cars
Image / Radar / Lidar
Object detection, (car detection, pedestrian detection, traffic sign detection, etc.)
Lane detection, (detect the lanes on the road)
Outputs the position of the lanes.
Trajectory prediction, (predict where the detected objects will be in the future)
Outputs the predicted position and speed of the detected objects.
Motion planning, (how to move the car, based on the detected objects, without collisions)
Outputs the path and speed of the car.
Path should avoid obstacles, and follow traffic rules.
Steer / Accelerate / Brake
flowchart LR
image_radar_lidar[Sensor data]
subgraph object_detection
lane_detection[Lane detection]
car_detection[Car detection]
pedestrian_detection[Pedestrian detection]
traffic_light_detection[Traffic light detection]
obstacle_detection[Obstacle detection]
end
trajectory_prediction[Trajectory prediction]
image_radar_lidar --> object_detection --> trajectory_prediction --> motion_planning --> steer_accelerate_brake[Steer / Accelerate / Brake]Roles and responsibilities in an AI team
Software Engineer
E.g., joke execution, timer execution, etc.
ML Engineer
Data gathering
Train a neural network
Test output
ML Researcher
Research new algorithms
Improve existing algorithms
Applied Learning Scientist
Somewhere b/w ML engineer and ML researcher
Data Scientist
Examine data and provide insights
Create dashboards, reports and presentations to team/executives.
Data Engineer
Organize data
Make sure data is saved securely, easily accessible and cost effectively.
AI product manager
Help decide what to build; whats feasible and valuable
You can start with a small team. You don't need a large team to start an AI project. Just you with a AI course and a dataset is enough to start.
AI transformation playbook
Execute small pilot projects to gain momentum
Can be in house or outsourced.
Show traction within 6/12 months.
Its more important for the first project to be successful, than to be big.
Build an in-house AI team
CEO, CAIO (Chief AI Officer)
AI team (central AI team)
Business unit 1
Business unit 2
Business unit 3 (gift card)
The central AI team, will be more like a consultancy, that helps the business units to implement AI in their projects.
They can help build company wise data infrastructures / platforms.
Better for AI team to have separate funding, rather than relying on the business units for funding.
Provide broad AI training
Executives and business leaders (What AI can do your enterprise, AI strategy, Resource allocation)
Pod leaders (Project direction, resource allocation, monitoring progress)
AI engineers (100hrs of training, Build and ship AI software, gather data, execute on specific AI projects)
Develop an AI strategy
Leverage AI to create an advantage specific to your industry sector.
Virtuous cycle of AI ( Better product --> More users --> More data --> Better product )
Consider creating a data strategy
Strategic data acquisition (Offer free services to collect data, Gmail)
Unified data warehouses (Collect data from all business units, and store it in a central place)
Create network effects and platform advantages
In industries where "winner takes all" is common, like social media, search engines, etc, AI can help accelerate the network effects.
Develop internal and external communications
Investor relations
Government relations (regulations, compliance, etc.)
Customer / user education
Talent / recruitment
Internal communications
AI pitfalls to avoid
Don't expect AI to solve all your problems, Be realistic about what AI can do.
Don't just hire 3/4 ML engineers and expect them to solve all your problems, You need a team with diverse skill sets, including data scientists, data engineers, software engineers, etc.
Don't expect AI projects to be successful in the first try, AI projects are iterative, you need to be prepared to fail and learn from your mistakes.
Traditional project planning doesn't work for AI projects, Work with your AI team to define the scope, timeline, and acceptance criteria for the project. AI KPIs are different from traditional software KPIs, you need to define them based on the AI project.
You don't need a superstar AI engineer to start an AI project, You can start with a small team, with online training.
Taking your first step in AI
Get friends to learn about AI
Start brainstorming projects
Hire a few ML/DS people to help
Hire or appoint an AI leader
Discuss with CEO/Board possibilities of AI transformation
Survey of major AI application areas
Supervised learning
Computer vision
Image classification (whole image is names) / Object recognition (parts of the image are named)
Facial recognition
Object detection, (finds position of objects in an image, and classifies them, draws a box around the object)
Image segmentation, (is this pixel part of a face? or a car? or a tree?, draws precise boundaries of objects in an image)
Tracking (follows objects in a video, like a car, or a person)
Natural language processing (NLP)
Text classification (spam detection, sentiment analysis, etc.)
Information retrieval (Search engines, question answering)
Name entity recognition (NER) (extracts names, dates, locations, etc. from text)
Machine translation (translates text from one language to another)
Speech processing
A microphone records very rapid changes in air pressure
Speech recognition (takes audio as input and outputs text)
Trigger word detection (detects whether a specific word is spoken, like "Alexa", "Hey Google", etc.)
Speaker ID (listens to someone speak and identifies who it is)
Speech synthesis (text to speech, converts text to audio)
Generative AI
Creates high-quality content such as images, video, text, audio, and music.
Input: prompt. Output: content.
Robotics
Perception (figures out what is in the environment, based on sensor input data)
Motion planning (figures out how to move the robot, based on the perception data)
Control (executes the motion plan, and moves the robot)
General machine learning
Unstructured data (images, audio, text, etc.)
Structured data (tabular data, like Excel sheets, databases, etc.)
Unsupervised learning
Clustering
Example: price per packet vs. number of packets sold
Detects purchase patterns in retail data
Groups similar items together, such as customers or products
Example: college students buy more energy drinks and less coffee
Data points live in a high-dimensional space, with dimensions such as price, quantity, and location.
Relationships between data points are discovered automatically, without any labels.
Can surface new insights, like "customers who buy energy drinks also buy chips" or "customers who buy coffee also buy pastries".
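A minimal sketch of the clustering idea above, assuming made-up customer data (energy drinks and coffees bought per month). The `kmeans` helper is a hand-written illustration of the k-means technique, not a library API, and the starting centroids are picked by hand to keep the run deterministic.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Tiny k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(vals) / len(cluster) for vals in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data: (energy drinks per month, coffees per month) per customer.
customers = [(9, 1), (8, 2), (10, 0), (1, 9), (2, 8), (0, 10)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
print(centroids)  # one centroid per discovered customer group
```

No labels are supplied: the two groups (energy-drink buyers and coffee buyers) come only from how close the points are to each other.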
Transfer learning
A model trained to detect cars on 100,000 images can be adapted to detect golf carts using only 100 golf cart images.
Reinforcement learning
A drone learns to fly itself by trying different actions and getting feedback from the environment.
A pet dog learns to behave well by getting treats for good behavior and scolding for bad behavior.
Reinforcing good behavior and punishing bad behavior.
Uses a "reward signal" to tell the AI when it is doing well or poorly.
Needs many iterations to learn the best actions. (The training itself generates a lot of data.)
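A rough sketch of the reward-signal loop described above, assuming a toy drone with only two actions and a fixed reward for each. The names `train`, `drone_reward`, and the action labels are hypothetical; real reinforcement learning problems have states, delayed rewards, and far more trials.

```python
import random

def train(reward_fn, actions, episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error: mostly repeat the best-known action,
    sometimes explore, and keep a running average reward per action."""
    rng = random.Random(seed)
    values = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)            # explore
        else:
            action = max(actions, key=values.get)   # exploit best estimate
        reward = reward_fn(action)                  # the "reward signal"
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values

def drone_reward(action):
    # Assumed rewards: staying level is good (+1), diving is bad (-1).
    return 1.0 if action == "hover" else -1.0

values = train(drone_reward, ["dive", "hover"])
best_action = max(values, key=values.get)
print(best_action)
```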
Generative adversarial networks (GANs)
Synthetic data generation
AI-generated "supermodels" (synthetic images of people)
Knowledge graph
A graph that represents knowledge in a structured way
Nodes represent entities, and edges represent relationships between entities
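One simple way to picture a knowledge graph in code, assuming it is stored as (subject, relation, object) triples. The entities, relation names, and the `related` helper are illustrative choices, not a standard schema.

```python
from collections import defaultdict

# Each triple is one edge: subject --relation--> object.
triples = [
    ("Ada Lovelace", "born_in", "London"),
    ("Ada Lovelace", "occupation", "Mathematician"),
    ("London", "capital_of", "United Kingdom"),
]

# Nodes are entities; edges are stored per subject.
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))

def related(entity, relation):
    """Return every entity reachable from `entity` over `relation`."""
    return [obj for rel, obj in graph.get(entity, []) if rel == relation]

print(related("Ada Lovelace", "born_in"))
```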
Smart speaker example
"Hey device, tell me a joke"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent (joke? time? music? call? weather?))
Execute action (if the intent is a joke, get a joke from the database and return it as text)
AI pipeline: audio --> trigger word detection --> speech recognition --> intent recognition --> execute action
"Hey device, set timer for 10 minutes"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent (set_timer?, timer_duration?))
Execute action (if the intent is set_timer, set the timer for timer_duration minutes)
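The four steps can be sketched as a chain of functions. This is only an illustration under a big assumption: the "audio" here is a plain string standing in for a recording, so the trigger word and speech recognition steps are stubs rather than real models. All function names are hypothetical.

```python
import re

TRIGGER = "hey device"

def detect_trigger_word(audio):
    # Stand-in for a trained trigger word model.
    return audio.lower().startswith(TRIGGER)

def recognize_speech(audio):
    # Stand-in for speech recognition: drop the trigger word, keep the command.
    return audio.lower()[len(TRIGGER):].strip(" ,.")

def recognize_intent(text):
    match = re.search(r"set (?:a )?timer for (\d+) minutes?", text)
    if match:
        return {"intent": "set_timer", "timer_duration": int(match.group(1))}
    if "joke" in text:
        return {"intent": "joke"}
    return {"intent": "unknown"}

def execute(intent):
    if intent["intent"] == "set_timer":
        return f"Timer set for {intent['timer_duration']} minutes"
    if intent["intent"] == "joke":
        return "Here is a joke from the database"
    return "Sorry, I did not understand"

def handle(audio):
    """Run the full pipeline; ignore audio without the trigger word."""
    if not detect_trigger_word(audio):
        return None
    return execute(recognize_intent(recognize_speech(audio)))

print(handle("Hey device, set timer for 10 minutes"))
```

Splitting the system this way lets each step be built, tested, and improved separately.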
Self-driving cars
Input: image / radar / lidar
Object detection (car detection, pedestrian detection, traffic sign detection, etc.)
Lane detection (detects the lanes on the road)
Outputs the position of the lanes.
Trajectory prediction (predicts where the detected objects will be in the future)
Outputs the predicted position and speed of the detected objects.
Motion planning (decides how to move the car, given the detected objects, without collisions)
Outputs the path and speed of the car.
The path should avoid obstacles and follow traffic rules.
Output: steer / accelerate / brake
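A toy sketch of the trajectory prediction and motion planning steps, assuming objects keep a constant velocity and that a plan is "safe" when the car stays a minimum distance from the predicted obstacle positions. The function names, the 2 m gap, and all coordinates are made up for illustration; real systems are far more involved.

```python
import math

def predict_positions(position, velocity, horizon_s):
    """Constant-velocity prediction: one (x, y) position per second."""
    return [(position[0] + velocity[0] * t, position[1] + velocity[1] * t)
            for t in range(1, horizon_s + 1)]

def path_is_safe(car_path, obstacle_path, min_gap_m=2.0):
    """Compare positions at matching time steps and require a minimum gap."""
    return all(math.dist(c, o) >= min_gap_m
               for c, o in zip(car_path, obstacle_path))

# Hypothetical pedestrian walking toward the road at 1 m/s.
pedestrian = predict_positions(position=(10.0, -3.0), velocity=(0.0, 1.0), horizon_s=3)

# Two candidate plans for the car: 3 m/s and 2 m/s along the x axis.
car_fast = predict_positions((0.0, 0.0), (3.0, 0.0), 3)
car_slow = predict_positions((0.0, 0.0), (2.0, 0.0), 3)

print(path_is_safe(car_fast, pedestrian), path_is_safe(car_slow, pedestrian))
```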
Roles and responsibilities in an AI team
Software Engineer
E.g., joke execution, timer execution, etc.
ML Engineer
Data gathering
Train a neural network
Test output
ML Researcher
Research new algorithms
Improve existing algorithms
Applied ML Scientist
Somewhere between ML engineer and ML researcher
Data Scientist
Examine data and provide insights
Create dashboards, reports and presentations to team/executives.
Data Engineer
Organize data
Make sure data is stored securely, accessibly, and cost-effectively.
AI product manager
Help decide what to build: what's feasible and valuable
You can start with a small team. You don't need a large team to start an AI project. You, an AI course, and a dataset are enough to get started.
AI transformation playbook
Execute small pilot projects to gain momentum
Can be in house or outsourced.
Show traction within 6-12 months.
It's more important for the first project to succeed than to be big.
Build an in-house AI team
CEO, CAIO (Chief AI Officer)
AI team (central AI team)
Business unit 1
Business unit 2
Business unit 3 (gift card)
The central AI team acts like a consultancy that helps the business units implement AI in their projects.
It can also help build company-wide data infrastructure and platforms.
It is better for the AI team to have separate funding than to rely on the business units for funding.
Provide broad AI training
Executives and business leaders (what AI can do for your enterprise, AI strategy, resource allocation)
Pod leaders (Project direction, resource allocation, monitoring progress)
AI engineers (100hrs of training, Build and ship AI software, gather data, execute on specific AI projects)
Develop an AI strategy
Leverage AI to create an advantage specific to your industry sector.
Virtuous cycle of AI ( Better product --> More users --> More data --> Better product )
Consider creating a data strategy
Strategic data acquisition (offer free services to collect data, e.g. Gmail)
Unified data warehouses (Collect data from all business units, and store it in a central place)
Create network effects and platform advantages
In "winner takes all" industries, such as social media and search engines, AI can help accelerate network effects.
Develop internal and external communications
Investor relations
Government relations (regulations, compliance, etc.)
Customer / user education
Talent / recruitment
Internal communications
AI and Society
AI and hype
We should be neither too optimistic nor too pessimistic about AI.
AI is a very powerful tool, but it has its limitations. We can mitigate its potential harms and use it to create tremendous value.
Limitations of AI
Explainability is hard (we often need an AI to explain why it made a certain decision)
Bias (if an AI is trained on biased data, it will produce biased results)
Susceptible to adversarial attacks
AI, developing economies and jobs
Bias
AI learning unhealthy stereotypes.
AI can pick up racist, sexist, and otherwise biased behavior from its training data.
For example, training text may associate programming more often with men than with women.
If a face recognition system is trained on a dataset with more images of white faces than Black faces, it will perform better on white faces than on Black faces.
A bank's AI may suggest lower credit limits for Black applicants than for white applicants, even when they have the same credit score.
A resume-screening AI may favor men over women if its training data is biased.
Reducing bias in AI systems is paramount.
Combating bias
Zero out the bias in the words (let's say "White programmer" has an association of 0.8 and "Black programmer" has an association of 0.2; we can zero out the bias by making both associations equal to 0.5 in the data space)
Use less biased data.
Use more inclusive data. (Make sure people of different races are well represented in the data.)
Audit systems to find out whether the AI is biased.
Diverse workforce. A more inclusive workforce can help reduce bias in AI systems.
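One known way to "zero out" bias in word representations is to remove the part of a word's vector that points along a bias direction. The sketch below assumes tiny made-up 3-number word vectors and a hypothetical `neutralize` helper. It uses the gender example from the Bias notes and illustrates the projection idea only; it is not necessarily the exact procedure these notes refer to.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def neutralize(word_vec, bias_direction):
    """Subtract the component of word_vec that lies along bias_direction."""
    scale = dot(word_vec, bias_direction) / dot(bias_direction, bias_direction)
    return [w - scale * b for w, b in zip(word_vec, bias_direction)]

# Hypothetical word vectors (real embeddings have hundreds of dimensions).
man = [1.0, 0.0, 0.5]
woman = [-1.0, 0.0, 0.5]
programmer = [0.6, 0.8, 0.1]   # leans toward "man" on the first axis

bias_direction = [m - w for m, w in zip(man, woman)]
debiased = neutralize(programmer, bias_direction)

# After neutralizing, "programmer" is equally associated with both words.
print(dot(debiased, man), dot(debiased, woman))
```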
Adversarial attacks on AI
AI can be tricked into revealing sensitive information.
Minor perturbations (changes) to an image can make an AI classify a hummingbird as a hammer.
Physical attacks, like putting a specific sticker on a stop sign, can make the AI think it's a speed limit sign.
Wearing a certain type of glasses can make the AI think you are a different person.
AI can be fooled into misclassifying images by adding noise to the image.
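A small sketch of why tiny changes can flip a prediction, assuming a toy linear classifier over four "pixels". The weights, the image, and the `perturb` helper are invented for illustration. Real attacks target deep networks, but the principle of nudging every input slightly in the most harmful direction is the same.

```python
def score(weights, pixels):
    return sum(w * p for w, p in zip(weights, pixels))

def classify(weights, pixels):
    return "hummingbird" if score(weights, pixels) > 0 else "hammer"

def perturb(weights, pixels, epsilon):
    """Shift each pixel by epsilon in the direction that lowers the score
    (the sign-of-gradient idea, applied to a linear model)."""
    return [p - epsilon * (1 if w > 0 else -1) for w, p in zip(weights, pixels)]

weights = [0.5, -0.25, 0.75, -0.5]   # hypothetical trained weights
image = [0.5, 0.5, 0.5, 0.75]        # hypothetical pixel intensities
adversarial = perturb(weights, image, epsilon=0.1)

print(classify(weights, image), classify(weights, adversarial))
```

Each pixel moves by only 0.1, yet the label changes, because the small changes all push the score the same way.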
Defenses
Ongoing research.
As with spam vs. anti-spam, we may be in an arms race for some applications.
AI-generated video detectors, to tell whether a video is real or fake.
Adverse uses of AI
Deepfakes
Oppressive surveillance
Fake reviews / fake comments (political bots)
Spam vs. anti-spam; fraud vs. anti-fraud
AI and developing economies
There will be fewer opportunities for low-skilled workers.
AI will automate away certain jobs, like data entry, customer support, etc.
AI and jobs
There is a lot of uncertainty about how AI will impact jobs.
AI will create more jobs than it will displace.
Supervised learning
email --> is_spam (spam filtering)
audio --> text transcription (speech recognition)
English --> French (machine translation)
visual inspection --> defect detection (quality control)
sequence of words --> next word prediction (chatbot)
LLMs
An LLM repeatedly predicts the next word.
input --> output
my favorite drink --> is
my favorite drink is --> coffee
my favorite drink is coffee, and I like it because it --> is
Output is sanitized so that it is not offensive or harmful.
Neural networks have improved the quality of LLMs: the more data they are trained on, the better they get.
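To make "repeatedly predicts the next word" concrete, here is a deliberately tiny sketch that counts which word follows which in a made-up corpus and then extends a prompt one word at a time. This counting model is a stand-in chosen for simplicity; real LLMs use large neural networks, and the names `train_bigrams` and `generate` are illustrative.

```python
from collections import Counter, defaultdict

corpus = ("my favorite drink is coffee . my favorite food is pizza . "
          "my favorite drink is coffee .")

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, prompt, max_new_words=3):
    """Append the most frequent next word, one word at a time."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

counts = train_bigrams(corpus)
print(generate(counts, "my favorite drink", max_new_words=2))
# -> my favorite drink is coffee
```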
What is data? (dataset)
Data is unique to your business.
What is input and what is output?
size_of_house + no_of_bedrooms + location --> price_of_house
How do you acquire data?
Manual labeling
Observing user behavior (user_id, time, price, has_purchased?)
Observing machine behavior (machine_id, time, temperature, pressure, has_failed?). Based on the data, you can tell whether a machine is going to fail.
Downloading it from websites or getting it through partnerships, e.g. Kaggle, UCI Machine Learning Repository (keep licensing and copyright in mind).
Not all data is created equal; some data is more valuable than other data.
More data doesn't always mean better results; the quality of the data matters more than the quantity.
Data can have missing values, outliers, and noise. (We need to clean up data before processing it.)
Some types of data, such as images, audio, and text, are unstructured: they don't have a predefined format.
Techniques for dealing with unstructured data are different from those for structured data.
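A minimal sketch of cleaning structured data before use, assuming a handful of made-up house records. The cleaning rules (fill a missing bedroom count with the median, drop prices below a plausibility threshold) and the `clean` helper are illustrative choices, not a recommended standard.

```python
from statistics import median

# Hypothetical records with one missing value and one obvious outlier.
rows = [
    {"size_of_house": 1200, "no_of_bedrooms": 3, "price_of_house": 250_000},
    {"size_of_house": 1500, "no_of_bedrooms": None, "price_of_house": 310_000},
    {"size_of_house": 1400, "no_of_bedrooms": 3, "price_of_house": 1},
    {"size_of_house": 2000, "no_of_bedrooms": 4, "price_of_house": 400_000},
]

def clean(rows, min_price=10_000):
    """Drop rows with implausible prices and fill missing bedroom counts."""
    known = [r["no_of_bedrooms"] for r in rows if r["no_of_bedrooms"] is not None]
    fill_value = median(known)
    cleaned = []
    for r in rows:
        if r["price_of_house"] < min_price:   # likely a data entry error
            continue
        bedrooms = fill_value if r["no_of_bedrooms"] is None else r["no_of_bedrooms"]
        cleaned.append({**r, "no_of_bedrooms": bedrooms})
    return cleaned

cleaned = clean(rows)
print(len(cleaned), cleaned[1]["no_of_bedrooms"])
```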
The terminology of AI
ML vs data science
Machine learning: the field of study that gives computers the ability to learn without being explicitly programmed.
ML algorithms use datasets to learn how to get an output from an input, e.g. size_of_house + no_of_bedrooms + location --> price_of_house.
The result is a software model that can be run to generate outputs for a given input.
Data science: the science of extracting knowledge and insights from data.
A data science team analyzes datasets to find insights, e.g. "Did you know newly renovated houses sell for 20% more than non-renovated houses?"
The result is often a slide deck or a pitch deck for investors that presents the insights.
Deep learning
Artificial neural networks.
Nodes and connections, loosely inspired by the human brain.
However, it is almost completely unrelated to how the actual human brain works.
Takes input and outputs a prediction.
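Nesting: AI contains machine learning, which in turn contains deep learning. Machine learning also includes other tools, and AI also includes tools outside machine learning, like knowledge graphs and rule-based systems.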
What makes an AI company?
What makes a good internet company?
Shopping mall + Website != Internet company
A/B testing
Short iteration cycles
Data-driven decision making (engineers and product managers make decisions based on data, not executive opinions)
What makes a good AI company?
Strategic data acquisition
Unified data platform
Using AI to automate repetitive tasks
New roles and responsibilities (ML engineers, data scientists, etc.)
What ML can and cannot do?
email --> spam? (0/1) (spam filtering)
audio --> text transcripts (speech recognition)
English --> Chinese (machine translation)
ad, user info --> click? (0/1) (online advertising)
image, radar info --> position of other cars (self-driving car)
image of phone --> defect? (0/1) (visual inspection)
sequence of words --> the next word (chatbot)
Imperfect rule of thumb: if a human can do a task with less than one second of thought, ML can probably do it too.
Feasible
Learning simple concepts
Having lots of data
e.g.:
Self-driving cars can estimate where the other cars are, based on their previous positions and what is in front of them.
Not feasible
Complex concepts
Having little sample data
e.g.:
An AI trained on one type of data may perform poorly on other types of data. (For example, an AI trained on lateral chest X-rays to detect pneumonia may not work on frontal chest X-rays, or on X-rays that are not aligned properly.)
Interpreting human gestures, like a road worker signaling a car to stop, is hard for AI to learn: it requires a complex understanding of human behavior and context, plus a large amount of data. (Even we struggle to understand them sometimes.)
Non technical explanation of deep learning
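[Diagram: neural network with inputs (price, shipping cost, marketing, material), nodes (N1, N2, N3), and output (demand)]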
The above neural network is a simple example of how deep learning works. It takes multiple inputs (price, shipping cost, marketing, material) and processes them through nodes (N1, N2, N3) to produce an output (demand).
The network figures out the relationships on its own: feed it many examples of (price, shipping cost, marketing, material, demand) and it learns how the inputs map to demand.
For example, a face detection model learns the relationships between pixels, edges, and shapes to identify parts of the face, like eyes, nose, and mouth, and then combines them to identify the face as a whole.
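A sketch of how the demand network computes its output, assuming three hidden nodes with hand-picked weights and biases. In a real network these numbers are learned from data rather than chosen by hand. The values and the `forward` helper are made up purely to show the arithmetic.

```python
def relu(x):
    return max(0.0, x)

def forward(inputs, hidden_weights, hidden_biases, output_weights):
    """Each hidden node takes a weighted sum of the inputs, applies ReLU,
    and the output is a weighted sum of the hidden node values."""
    hidden = [relu(b + sum(w * x for w, x in zip(ws, inputs)))
              for ws, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden))

# Inputs in made-up units: price, shipping cost, marketing, material.
inputs = [2.0, 1.0, 3.0, 1.0]

hidden_weights = [
    [-0.5, -0.5, 0.0, 0.0],   # N1 reacts to price and shipping cost
    [0.0, 0.0, 1.0, 0.0],     # N2 reacts to marketing
    [0.5, 0.0, 0.0, 1.0],     # N3 reacts to price and material
]
hidden_biases = [4.0, 0.0, 0.0]
output_weights = [2.0, 1.0, 1.0]

demand = forward(inputs, hidden_weights, hidden_biases, output_weights)
print(demand)  # 10.0
```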
Starting point of an AI project
What is the workflow of a machine learning project?
Key steps of a machine learning project
Alexa
Collect data (audio of people saying "Alexa" or other trigger words)
Train the model to recognize the trigger words (iterate many times until it's good enough)
Deploy the model (get data back, then maintain and update the model; getting data back may or may not be possible, depending on the company's privacy and security policies)
Self-driving car
Collect data (pictures with car positions marked as red squares)
Train the model to predict the position of other cars
Deploy model
What is the workflow of a data science project?
Optimizing a sales funnel
Collect data of user behavior on the website (visits, clicks, time spent, etc.)
Analyze the data to find insights, e.g. "Overseas users leave when they find high shipping costs" or "Spend fewer marketing dollars on overseas users".
Suggest hypotheses to improve the sales funnel, e.g. "Reduce shipping costs for overseas users", then re-analyze the data to see whether the changes had an impact.
Optimizing the manufacturing line
Steps: mix clay --> shape mug --> add glaze --> fire kiln --> final inspection
Collect data on the manufacturing process (temperature, pressure, time, etc.)
Analyze the data to find insights, e.g. "High temperature leads to more defects", "Reduce temperature to reduce defects", "Because ambient temperature is warmer in the afternoon, we need to reduce the temperature in the afternoon".
Suggest hypotheses to improve the manufacturing process or yield, e.g. "Reduce temperature to reduce defects", then re-analyze the data to see whether the changes had an impact.
Every job function needs to learn how to use data
Data science can help optimize the sales funnel; machine learning can automate lead sorting.
Data science can help optimize the manufacturing process; machine learning can automate quality control.
Data science can help optimize the recruiting funnel; machine learning can automate resume screening. (Your system should be ethical and not biased.)
Data science can help optimize the user experience of a website; machine learning can automate content recommendations, suggest push notifications, etc.
Data science can help suggest what to plant, when, and where; machine learning can detect where weeds are and help automate weeding.
How to choose an AI project?
What AI can do (AI experts)
Valuable for your business (domain experts)
Select a project in the overlap of the two.
Framework for brainstorming AI projects
Think about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (difficult for new players to come in)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do? What is valuable for your business?
The project should lie in the overlap between the two.
Technical diligence
Can an AI system meet the desired performance? (e.g. 95% accuracy)
How much data is needed to achieve that performance?
Engineering timeline
Business diligence
Lower costs
Increase revenue
Launch a new product or service
Ethical diligence
Does it make society better?
Build vs Buy?
ML projects can be in-house or outsourced.
DS projects are more commonly in-house. (They are so closely tied to your business that it makes sense to keep them in-house.)
Some things will be industry standard; don't reinvent the wheel. (Don't try to outrun a train.)
Working with an AI team
Specify acceptance criteria
Goal: detect defects in coffee mugs in 95% of cases (a statistical target, measured on average).
Provide the AI team with a dataset to measure performance. (A test set; it doesn't have to be too large, 1000-2000 samples is often enough.)
Data
Training set (examples labeled ok or defect; used to train the model and create the A --> B mapping)
Test set (not used to train the model; used to measure the model's performance, and should be representative of real-world data)
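A small sketch of how a training set, a test set, and an acceptance criterion fit together. It assumes each mug is described by a single made-up number (crack length in mm) and that the "model" is just a threshold. The data and helper names are hypothetical and only meant to show the workflow.

```python
def train_threshold(training_set):
    """'Train' by placing a threshold halfway between the largest ok
    example and the smallest defect example."""
    largest_ok = max(x for x, label in training_set if label == "ok")
    smallest_defect = min(x for x, label in training_set if label == "defect")
    return (largest_ok + smallest_defect) / 2

def predict(threshold, crack_length):
    return "defect" if crack_length > threshold else "ok"

def accuracy(threshold, test_set):
    correct = sum(predict(threshold, x) == label for x, label in test_set)
    return correct / len(test_set)

# Hypothetical labeled examples: (crack length in mm, label).
training_set = [(0.1, "ok"), (0.3, "ok"), (0.4, "ok"), (1.2, "defect"), (1.6, "defect")]
test_set = [(0.2, "ok"), (0.5, "ok"), (0.7, "defect"), (1.4, "defect")]

threshold = train_threshold(training_set)      # learned from training data only
test_accuracy = accuracy(threshold, test_set)  # measured on unseen data
meets_criterion = test_accuracy >= 0.95        # acceptance criterion from above

print(test_accuracy, meets_criterion)
```

The test set is kept out of training so that the measured accuracy reflects how the model handles examples it has never seen.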
Don't expect 100% accuracy
Limitations of ML
Insufficient data
Mislabeled data
Ambiguous labels
AI tools
ML Frameworks
PyTorch, TensorFlow, Hugging Face, PaddlePaddle, Scikit-learn, R.
Research publications
arXiv
Open source projects
GitHub
What is the workflow of a machine learning project?
Key steps of a machine learning project
Alexa
Collect data of people saying (Alexa or other trigger words)
Train the model to recognize the trigger words (Iterate many times until its good enough)
Deploy model (Get data back, maintain and update model) (may or may not be possible, depending on the privacy and security policies of the company)
Self-driving car
Collect data of car positions (Red squares on pictures)
Train the model to predict the position of other cars
Deploy model
Key steps of a machine learning project
Alexa
Collect data of people saying (Alexa or other trigger words)
Train the model to recognize the trigger words (Iterate many times until its good enough)
Deploy model (Get data back, maintain and update model) (may or may not be possible, depending on the privacy and security policies of the company)
Self-driving car
Collect data of car positions (Red squares on pictures)
Train the model to predict the position of other cars
Deploy model
What is the workflow of a data science project?
Optimizing a sales funnel
Collect data of user behavior on the website (visits, clicks, time spent, etc.)
Analyze the data to find insights "Overseas users leave when they find high shipping costs" "Spend fewer marketing dollars on overseas users"
Suggest hypotheses to improve the sales funnel "Reduce shipping costs for overseas users" Re-analyze the data to see if the changes had an impact
Optimizing the manufacturing line
Mix clay, shape mug, add glaze, fire kiln, final inspection
Collect data of the manufacturing process (temperature, pressure, time, etc.)
Analyze the data to find insights "High temperature leads to more defects" "Reduce temperature to reduce defects" "Because ambient temperature is warmer in the afternoon, we need to reduce the temperature in the afternoon"
Suggest hypotheses to improve the manufacturing process, or yield "Reduce temperature to reduce defects" Re-analyze the data to see if the changes had an impact
Every job function needs to learn how to use data
Data science can help optimize the sales funnel; machine learning can automate lead sorting.
Data science can help optimize the manufacturing process; machine learning can automate quality control.
Data science can help optimize the recruiting funnel; machine learning can automate resume screening. (Your system should be ethical and unbiased.)
Data science can help optimize the user experience of a website; machine learning can automate content recommendations, suggest push notifications, etc.
Data science can help suggest what to plant, when, and where; machine learning can detect where weeds are and help automate weeding.
How to choose an AI project?
What AI can do (AI experts)
Valuable for your business (domain experts)
Select a project in the overlap between the two
Framework for brainstorming AI projects
Think about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (difficult for new players to come in)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do? What is valuable for your business?
The project should sit in the overlap between the two questions above.
Technical diligence
Can an AI system meet the desired performance? (e.g., 95% accuracy)
How much data is needed to achieve that performance?
Engineering timeline
Business diligence
Lowering costs
Increasing revenue
Launching a new product or service
Ethical diligence
Does it make society better?
Build vs Buy?
ML projects can be in-house or outsourced.
DS projects are more commonly in-house. (Data science is so closely tied to your business that it makes sense to keep it in-house.)
Some things will be industry standard, don't reinvent the wheel. (Don't try to outrun a train)
Working with an AI team
Specify acceptance criteria
Goal: detect defects in coffee mugs with 95% accuracy (a statistical average over many mugs)
Provide the AI team with a dataset to measure performance (a test set; it doesn't have to be too large, 1,000-2,000 samples is enough)
Data
Training set (examples labeled ok or defect, used to train the model and learn the A -> B mapping)
Test set (not used to train the model; used to measure the model's performance; should be representative of real-world data)
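The split between training and test data, and the check against an acceptance criterion, can be sketched as follows. The integer "blemish score" feature, the threshold model, and the function names are assumptions for illustration only.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle samples and hold out a test set that is never used for training."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(model, labeled_samples):
    """Fraction of samples where the model's prediction matches the label."""
    correct = sum(1 for x, label in labeled_samples if model(x) == label)
    return correct / len(labeled_samples)


def meets_acceptance(model, test_set, required_accuracy=0.95):
    """Check the acceptance criterion on the held-out test set only."""
    return accuracy(model, test_set) >= required_accuracy


# Hypothetical data: an integer blemish score per mug, labeled ok or defect.
samples = [(score, "defect" if score >= 70 else "ok") for score in range(100)]
train_set, test_set = split_dataset(samples)

# "Training": learn a threshold from the training set only.
threshold = min(x for x, label in train_set if label == "defect")


def model(x):
    return "defect" if x >= threshold else "ok"


print(accuracy(model, test_set), meets_acceptance(model, test_set))
```

The important design point is that `threshold` is computed from `train_set` alone, so the number reported on `test_set` is an honest estimate of performance on unseen mugs.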
Don't expect 100% accuracy
Limitation of ML
Insufficient data
Mislabeled data
Ambiguous labels
AI tools
ML Frameworks
PyTorch, TensorFlow, Hugging Face, PaddlePaddle, Scikit-learn, R.
Research publications
arXiv
Open source projects
GitHub
Building AI in your company
Smart speaker example
"Hey device, tell me a joke"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent, e.g. joke? time? music? call? weather?)
Execute action (if it's a joke, get a joke from the database and return it as text)
AI Pipeline
flowchart LR
trigger_word_detection[Trigger word detection] --> speech_recognition[Speech recognition] --> intent_recognition[Intent recognition] --> execute_action[Execute action]
"Hey device, set timer for 10 minutes"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent (set_timer?) and its parameter (timer_duration?))
Execute action (if the intent is set_timer, set the timer for timer_duration minutes)
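The four-stage smart speaker pipeline can be sketched as four functions chained together. This is a minimal stand-in, assuming the "audio" is already a text string: real trigger-word detection and speech recognition are learned models, and every function name below is hypothetical.

```python
import re

TRIGGER = "hey device"  # trigger phrase from the example above


def detect_trigger_word(audio):
    """Stand-in for a trigger-word model: audio -> has_trigger_word_spoken."""
    return audio.lower().startswith(TRIGGER)


def recognize_speech(audio):
    """Stand-in for speech recognition: audio -> text after the trigger phrase."""
    return audio[len(TRIGGER):].strip(" ,.").lower()


def recognize_intent(text):
    """Stand-in for intent recognition: text -> (intent, parameters)."""
    match = re.search(r"set (?:a )?timer for (\d+) minutes?", text)
    if match:
        return "set_timer", {"timer_duration": int(match.group(1))}
    if "joke" in text:
        return "joke", {}
    return "unknown", {}


def execute_action(intent, params):
    """Ordinary software takes over once the intent is known."""
    if intent == "set_timer":
        return f"Timer set for {params['timer_duration']} minutes"
    if intent == "joke":
        return "Why did the model cross the road? To minimize its loss."
    return "Sorry, I did not understand that."


def run_pipeline(audio):
    """Chain the four stages; stay silent unless the trigger word was spoken."""
    if not detect_trigger_word(audio):
        return None
    text = recognize_speech(audio)
    intent, params = recognize_intent(text)
    return execute_action(intent, params)


print(run_pipeline("Hey device, set timer for 10 minutes"))
print(run_pipeline("Hey device, tell me a joke"))
```

Keeping each stage behind its own function mirrors how different teams can own and improve each component separately.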
Self driving cars
Image / Radar / Lidar
Object detection (car detection, pedestrian detection, traffic sign detection, etc.)
Lane detection (detects the lanes on the road)
Outputs the position of the lanes.
Trajectory prediction (predicts where the detected objects will be in the future)
Outputs the predicted position and speed of the detected objects.
Motion planning (decides how to move the car, based on the detected objects, without collisions)
Outputs the path and speed of the car.
The path should avoid obstacles and follow traffic rules.
Steer / Accelerate / Brake
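To make the hand-off between trajectory prediction and motion planning concrete, here is a rough sketch. It assumes constant-velocity extrapolation and a fixed safety radius, which are simplifications chosen for the example; the notes do not say which prediction or planning method a real car uses, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    x: float   # position along the road, metres
    y: float   # lateral position, metres
    vx: float  # velocity along the road, metres per second
    vy: float  # lateral velocity, metres per second


def predict_position(obj, seconds_ahead):
    """Constant-velocity extrapolation of where the object will be."""
    return (obj.x + obj.vx * seconds_ahead, obj.y + obj.vy * seconds_ahead)


def path_is_clear(planned_path, objects, safety_radius=2.0):
    """planned_path is a list of (t, x, y) waypoints for our own car."""
    for t, px, py in planned_path:
        for obj in objects:
            ox, oy = predict_position(obj, t)
            if (px - ox) ** 2 + (py - oy) ** 2 < safety_radius ** 2:
                return False
    return True


stopped_car = TrackedObject(x=20.0, y=0.0, vx=0.0, vy=0.0)
straight = [(0.0, 0.0, 0.0), (1.0, 10.0, 0.0), (2.0, 20.0, 0.0)]
lane_change = [(0.0, 0.0, 0.0), (1.0, 10.0, 1.75), (2.0, 20.0, 3.5)]

print(path_is_clear(straight, [stopped_car]))
print(path_is_clear(lane_change, [stopped_car]))
```

A planner would generate several candidate paths and keep only those that pass a check like `path_is_clear` and also obey traffic rules.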
flowchart LR
image_radar_lidar[Sensor data]
subgraph object_detection
lane_detection[Lane detection]
car_detection[Car detection]
pedestrian_detection[Pedestrian detection]
traffic_light_detection[Traffic light detection]
obstacle_detection[Obstacle detection]
end
trajectory_prediction[Trajectory prediction]
image_radar_lidar --> object_detection --> trajectory_prediction --> motion_planning[Motion planning] --> steer_accelerate_brake[Steer / Accelerate / Brake]
Roles and responsibilities in an AI team
Software Engineer
Builds the surrounding software, e.g. joke execution, timer execution, etc.
ML Engineer
Data gathering
Train a neural network
Test output
ML Researcher
Research new algorithms
Improve existing algorithms
Applied ML Scientist
Somewhere between ML engineer and ML researcher
Data Scientist
Examine data and provide insights
Create dashboards, reports and presentations to team/executives.
Data Engineer
Organize data
Make sure data is stored securely, is easily accessible, and is cost-effective to keep.
AI product manager
Help decide what to build: what's feasible and valuable
You can start with a small team. You don't need a large team to start an AI project. Just you, an AI course, and a dataset are enough to start.
AI transformation playbook
Execute small pilot projects to gain momentum
Can be in house or outsourced.
Show traction within 6 to 12 months.
It's more important for the first project to be successful than to be big.
Build an in-house AI team
CEO, CAIO (Chief AI Officer)
AI team (central AI team)
Business unit 1
Business unit 2
Business unit 3 (gift card)
The central AI team acts more like a consultancy that helps the business units implement AI in their projects.
It can help build company-wide data infrastructure and platforms.
It is better for the AI team to have separate funding rather than relying on the business units for funding.
Provide broad AI training
Executives and business leaders (what AI can do for your enterprise, AI strategy, resource allocation)
Pod leaders (Project direction, resource allocation, monitoring progress)
AI engineers (100hrs of training, Build and ship AI software, gather data, execute on specific AI projects)
Develop an AI strategy
Leverage AI to create an advantage specific to your industry sector.
Virtuous cycle of AI ( Better product --> More users --> More data --> Better product )
Consider creating a data strategy
Strategic data acquisition (offer free services to collect data, e.g. Gmail)
Unified data warehouses (Collect data from all business units, and store it in a central place)
Create network effects and platform advantages
In "winner takes all" industries, such as social media and search engines, AI can help accelerate network effects.
Develop internal and external communications
Investor relations
Government relations (regulations, compliance, etc.)
Customer / user education
Talent / recruitment
Internal communications
AI pitfalls to avoid
Don't expect AI to solve all your problems. Be realistic about what AI can do.
Don't just hire 3 or 4 ML engineers and expect them to solve all your problems. You need a team with diverse skill sets, including data scientists, data engineers, software engineers, etc.
Don't expect AI projects to succeed on the first try. AI projects are iterative; be prepared to fail and learn from your mistakes.
Traditional project planning doesn't work unchanged for AI projects. Work with your AI team to define the scope, timeline, and acceptance criteria for the project. AI KPIs differ from traditional software KPIs, so define them for each AI project.
You don't need a superstar AI engineer to start an AI project. You can start with a small team and online training.
Taking your first step in AI
Get friends to learn about AI
Start brainstorming projects
Hire a few ML/DS people to help
Hire or appoint an AI leader
Discuss with CEO/Board possibilities of AI transformation
Survey of major AI application areas
Supervised learning
Computer vision
Image classification (the whole image is named) / Object recognition (parts of the image are named)
Facial recognition
Object detection (finds the position of objects in an image, classifies them, and draws a box around each object)
Image segmentation (is this pixel part of a face, a car, or a tree? Draws precise boundaries around objects in an image)
Tracking (follows objects in a video, like a car or a person)
Natural language processing (NLP)
Text classification (spam detection, sentiment analysis, etc.)
Information retrieval (Search engines, question answering)
Named entity recognition (NER) (extracts names, dates, locations, etc. from text)
Machine translation (translates text from one language to another)
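As one concrete instance of text classification, spam detection can be sketched with a tiny bag-of-words naive Bayes classifier. Naive Bayes is just one possible technique, picked here because it fits in a few lines; the four training messages and all names are invented for the example.

```python
import math
from collections import Counter


def train_naive_bayes(labeled_texts):
    """Count words per label and documents per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in labeled_texts:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical labeled emails (A -> B pairs: text -> spam or ham).
training_data = [
    ("win free prize now", "spam"),
    ("free money now", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]
spam_model = train_naive_bayes(training_data)

print(classify("free prize", spam_model))
print(classify("meeting tomorrow", spam_model))
```

Production spam filters use far more data and stronger models, but the input-to-label structure is the same.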
Speech processing
A microphone records very rapid changes in air pressure
Speech recognition (takes audio as input and outputs text)
Trigger word detection (detects if a specific word is spoken, like "Alexa", "Hey Google", etc.)
Speaker ID (listens to someone speak and identifies who it is)
Speech synthesis (text to speech, converts text to audio)
Generative AI
Creates high-quality content such as images, videos, text, audio, and music
Input: prompt; output: content
Robotics
Perception (figures out what is in the environment, based on sensor input data)
Motion planning (figures out how to move the robot, based on the perception data)
Control (executes the motion plan, and moves the robot)
General Machine learning
Unstructured data (images, audio, text, etc.)
Structured data (tabular data, like Excel sheets, databases, etc.)
Unsupervised learning
Clustering
Example: price per packet vs. number of packets sold
Detects purchase patterns in retail data
Groups similar items together, like customers, products, etc.
Example pattern: college kids purchase more energy drinks and less coffee
Each data point lives in a high-dimensional feature space (price, quantity, location, etc.)
Relationships between data points are constructed automatically, without any labels.
Can come up with new insights, like "customers who buy energy drinks also buy chips", or "customers who buy coffee also buy pastries".
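Clustering can be illustrated with k-means, one common clustering algorithm (the notes do not name a specific one). The sketch below groups hypothetical products by price per packet and packets sold, with no labels involved; the data points and starting centroids are assumptions.

```python
def assign(points, centroids):
    """Index of the nearest centroid for each point (squared Euclidean distance)."""
    labels = []
    for px, py in points:
        distances = [(px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update(points, labels, old_centroids):
    """Move each centroid to the mean of the points assigned to it."""
    centroids = []
    for j, old in enumerate(old_centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            centroids.append(old)  # keep an empty cluster's centroid in place
            continue
        centroids.append((
            sum(x for x, _ in members) / len(members),
            sum(y for _, y in members) / len(members),
        ))
    return centroids


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(initial_centroids)
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
        labels = assign(points, centroids)
    return labels, centroids


# Hypothetical products: (price per packet, packets sold).
products = [(1.0, 90), (1.2, 100), (0.8, 110), (4.0, 20), (4.5, 15), (5.0, 25)]
labels, centroids = k_means(products, initial_centroids=[products[0], products[3]])
print(labels, centroids)
```

In practice the features would be rescaled first, since packets sold spans a much larger range than price and would otherwise dominate the distance.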
Transfer learning
A model trained to detect cars with 100,000 images can be adapted to detect golf carts with only 100 golf cart images.
Reinforcement learning
A drone learns to fly itself by trying different actions and getting feedback from the environment.
A pet dog learns to behave well by getting treats for good behavior and scolding for bad behavior.
Reinforces good behavior and punishes bad behavior.
Uses a "reward signal" to tell the AI when it is doing well or poorly.
Needs many iterations to learn the best actions. (The repeated trials produce a lot of data.)
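The reward-signal idea can be sketched with an epsilon-greedy agent choosing between a few actions. This is a deliberately tiny stand-in for reinforcement learning, assuming fixed, noise-free rewards per action; the action meanings and reward values are hypothetical.

```python
import random


def run_bandit(reward_fn, n_actions, steps=200, epsilon=0.1, seed=0):
    """Learn action values from rewards by repeated trial and feedback."""
    rng = random.Random(seed)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions
    for step in range(steps):
        if step < n_actions:
            action = step  # try every action once first
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore a random action
        else:
            action = estimates.index(max(estimates))  # exploit the best so far
        reward = reward_fn(action)
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical drone actions: 0 = tilt left, 1 = hover level, 2 = tilt right.
rewards = [0.2, 1.0, 0.5]
estimates, counts = run_bandit(lambda action: rewards[action], n_actions=3)
best_action = estimates.index(max(estimates))
print(best_action, estimates, counts)
```

The agent is never told which action is correct; it only sees the reward after acting, which is the key difference from supervised learning.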
Generative adversarial networks (GANs)
Synthetic data generation
AI super models generation
Knowledge graph
A graph that represents knowledge in a structured way
Nodes represent entities, and edges represent relationships between entities
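A knowledge graph can be sketched as a set of (entity, relationship, entity) triples. The class and method names below are hypothetical, and real knowledge graphs use dedicated stores with far richer query support.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Nodes are entities; labeled edges are relationships between them."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, subject, relation, obj):
        """Store one fact as a directed, labeled edge."""
        self._edges[subject].append((relation, obj))

    def query(self, subject, relation):
        """Return every entity linked to `subject` by `relation`."""
        return [o for r, o in self._edges.get(subject, []) if r == relation]


graph = KnowledgeGraph()
graph.add("Marie Curie", "born_in", "Warsaw")
graph.add("Warsaw", "capital_of", "Poland")

# Follow two edges to answer: in which country was Marie Curie born?
city = graph.query("Marie Curie", "born_in")[0]
print(graph.query(city, "capital_of"))
```

Chaining edges like this is what lets a search engine answer a question directly instead of only returning matching pages.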
AI and Society
AI and hype
We should be neither too optimistic nor too pessimistic about AI.
AI is a very powerful tool, but it has its limitations. We can mitigate its potential harms and use it to create tremendous value.
Limitations of AI
Explainability is hard (it is difficult for an AI to explain why it made a certain decision)
Bias (if an AI is trained on biased data, it will produce biased results)
Susceptible to adversarial attacks
AI, developing economies and jobs
Bias
AI can learn unhealthy stereotypes.
AI can pick up racist, sexist, and otherwise biased behavior from its training data.
For example, a language model may associate programming with men more than with women because its training data contains more such associations.
If a face recognition system is trained on a dataset with more images of white faces than black faces, it will perform better on white faces than on black faces.
A bank's AI may suggest lower credit limits for black people than for white people, even if they have the same credit score.
A resume-screening AI may favor men over women if its training data is biased.
Reducing bias in AI systems is paramount.
Combating bias
Zero out the bias in the word representations (say "White programmer" has an association of 0.8 and "Black programmer" has 0.2; we can zero out the bias by making both associations equal to 0.5 in the data space)
Use less biased data.
Use more inclusive data. (Make sure most races are represented in the data.)
Audit the system to find out whether the AI is biased.
Build a diverse workforce. A more inclusive workforce can help reduce bias in AI systems.
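One way to picture "zeroing out" bias in word representations is to remove a word vector's component along a bias direction. This projection step is an assumed illustration of the idea, not necessarily the exact procedure the notes have in mind, and the 2-D vectors are made up.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def neutralize(vector, bias_direction):
    """Remove the component of `vector` that lies along `bias_direction`."""
    scale = dot(vector, bias_direction) / dot(bias_direction, bias_direction)
    return [v - scale * b for v, b in zip(vector, bias_direction)]


# Hypothetical 2-D embedding: axis 0 is the bias direction, axis 1 is meaning.
bias_direction = [1.0, 0.0]
programmer = [0.6, 0.8]

neutral_programmer = neutralize(programmer, bias_direction)
print(neutral_programmer, dot(neutral_programmer, bias_direction))
```

After the projection the word has no component along the bias axis, so it is equally associated with both ends of that axis while keeping the rest of its meaning.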
Adversarial attacks on AI
AI can be fooled into revealing sensitive information.
AI can be made to classify a hummingbird as a hammer by making minor perturbations (changes) to the image.
Physical attacks, like putting a specific sticker on a stop sign, can make the AI think it's a speed limit sign.
Wearing a certain type of glasses can make the AI think you are a different person.
AI can be fooled into misclassifying images by adding noise to the image.
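How a small change to every input value can flip a classifier's answer can be sketched on a toy linear model. This uses a sign-of-gradient perturbation on a four-feature classifier as an assumed illustration; the weights, the input, and the class names attached to the scores are all invented, and real attacks target far larger image models.

```python
def score(weights, bias, x):
    """Linear score; positive means 'hummingbird', otherwise 'hammer'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    return "hummingbird" if score(weights, bias, x) > 0 else "hammer"


def sign(value):
    """Return -1, 0, or +1."""
    return (value > 0) - (value < 0)


def perturb(weights, x, epsilon):
    """Shift each feature by at most epsilon in the direction that lowers the score."""
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and input (four "pixel" values).
weights = [1.0, -1.0, 1.0, -1.0]
bias = 0.0
image = [0.5, 0.25, 0.5, 0.5]

adversarial = perturb(weights, image, epsilon=0.125)
print(classify(weights, bias, image), classify(weights, bias, adversarial))
```

Each value moves by only 0.125, yet the small shifts add up across features; with thousands of pixels, a much smaller change per pixel can be enough.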
Defenses
Defenses are an area of ongoing research.
As with spam vs. anti-spam, we may be in an arms race for some applications.
AI-generated video detectors try to detect whether a video is real or fake.
Adverse uses of AI
Deepfakes
Oppressive surveillance
Fake reviews / fake comments (political bots)
Spam vs. anti-spam; fraud vs. anti-fraud
AI and developing economies
There will be fewer opportunities for low-skilled workers.
AI will automate away certain jobs, such as data entry and customer support.
AI and jobs
There is a lot of uncertainty about how AI will impact jobs.
AI may create more jobs than it displaces.
No title
AI and hype
We should neither be optimistic or pessimistic about AI.
AI is a very powerful tool, but it has its limitations. We can mitigate its potential harms and use it to create tremendous value.
Limitations of AI
Explainablity is hard (AI needs to explain why it made a certain decision)
Bias, (If an AI is trained on biased data, it will produce biased results)
Susceptible to Adversarial attacks
AI, developing economies and jobs
Bias
AI learning unhealthy stereotypes.
AI can be racist, sexist, and biased, from data.
This is because training data has more associations for men with programming than with women.
If a face recognition system is trained on a dataset that has more images of white faces than black faces, it will perform better on white faces than black faces.
Banks may suggest lower credit limits for black people than white people, even if they have the same credit score.
An resume screening AI may favor more men than women, if its training data is biased.
Reducing bias in AI systems is paramount.
Combating bias
Zero out the bias in the words (Lets say "White programmer" is associated with 0.8, and "Black programmer" is associated with 0.2, then we can zero out the bias by making both associations equal to 0.5, in the data space)
Use less biased data.
Use a more inclusive data. (Make sure most races are represented in the data)
Audit to figure out if the AI is biased.
Diverse workforce. Having more inclusive workforce, can help reduce bias in AI systems.
Adversarial attacks on AI
AI can be fooled to spit out sensitive information.
AI can classify a hummingbird as a hammer, by making minor perturbation (changes) to the image.
Physical attacks, like putting on a specific sticker on a stop sign, can make the AI think its a speed limit sign.
Putting on a certain type of glasses can make the AI think its a different person.
AI can be fooled to misclassify images, by adding noise to the image.
Defenses
Ongoing research.
Like a spam vs anti-spam, we may be in a arms race for some application.
AI generated video detector, to detect if a video is real or fake.
Adverse uses of AI
DeepFakes
Oppressive surveillance
Fake reviews / Fake comments (political bots)
Spam vs Anti Spam; Fraud vs Anti Fraud;
AI and developing economies
There will be less opportunities for low-skilled workers.
AI will automate away certain jobs, like data entry, customer support, etc.
What is the workflow of a data science project?
Optimizing a sales funnel
Collect data of user behavior on the website (visits, clicks, time spent, etc.)
Analyze the data to find insights "Overseas users leave when they find high shipping costs" "Spend fewer marketing dollars on overseas users"
Suggest hypotheses to improve the sales funnel "Reduce shipping costs for overseas users" Re-analyze the data to see if the changes had an impact
Optimizing the manufacturing line
Mix clay, shape mug, add glaze, fire kiln, final inspection
Collect data of the manufacturing process (temperature, pressure, time, etc.)
Analyze the data to find insights "High temperature leads to more defects" "Reduce temperature to reduce defects" "Because ambient temperature is warmer in the afternoon, we need to reduce the temperature in the afternoon"
Suggest hypotheses to improve the manufacturing process, or yield "Reduce temperature to reduce defects" Re-analyze the data to see if the changes had an impact
Every job function needs to learn how to use data
Data science can optimise the sales funnel, machine learning, can automate lead sorting.
Data science can help optimize the manufacturing process, machine learning can automate quality control.
Data science can help optimize the recruiting funnel, machine learning can automate resume screening. (Your system should be ethical and not biased)
Data science can help optimise the user-experience of a website, machine learning can automate content recommendations, can suggest push notifications, etc.
Data science can help suggest what to plant when and where, machine learning can detect where weeds are and help automate weeding.
How to choose an AI project?
No title
What AI can do (AI experts)
Valuable for your business (domain experts)
Select a project that's overlapping
Framework for brainstorming AI projects
Thinking about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (difficult for new players to come in)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do?, Whats? valuable for your business?
Should overlap b/w the above steps.
Technical diligence
Can a AI system meet desired performance? (eg, 95% accuracy)
How much data is needed to achieve that performance?
Engineering timeline
Business diligence
Lowering costs
Increases revenue
Launch a new product or service
Ethical diligence
Does it make the society better?
Build vs Buy?
ML projects can be in-house or outsourced.
DS projects are more commonly in-house. (Its so closely tied to your business, it makes sense to keep it in-house)
Some things will be industry standard, don't reinvent the wheel. (Don't try to outrun a train)
Framework for brainstorming AI projects
Thinking about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (difficult for new players to come in)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do?, Whats? valuable for your business?
Should overlap b/w the above steps.
Technical diligence
Can a AI system meet desired performance? (eg, 95% accuracy)
How much data is needed to achieve that performance?
Engineering timeline
Build vs Buy?
ML projects can be in-house or outsourced.
DS projects are more commonly in-house. (Its so closely tied to your business, it makes sense to keep it in-house)
Some things will be industry standard, don't reinvent the wheel. (Don't try to outrun a train)
Working with an AI team
Specify an acceptance criteria
Goal: detect defects in coffee mugs, in 95% of cases. (statistically, avg)
Provide AI team with a dataset to measure performance. (test set, doesn't have to be tool large, 1000-2000 samples is enough)
Data
Training set (ok, defect, used to train the model, and create A -> B mapping)
Test set (will not be used to train the model, used to measure performance of the model, should be representative of the real world data)
Don't expect 100% accuracy
Limitations of ML
Insufficient data
Mislabeled data
Ambiguous labels
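The acceptance criterion above can be checked mechanically. The sketch below is a minimal illustration, not a real inspection system: the mug data, the `threshold_model` stand-in, and all function names are hypothetical. It holds out a test set that is never used for training and compares measured accuracy against the 95% target.

```python
import random

def split_train_test(samples, test_fraction=0.2, seed=0):
    """Shuffle labeled samples and hold out a test set that is never trained on."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, test_set):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(1 for features, label in test_set if predict(features) == label)
    return correct / len(test_set)

def meets_acceptance(predict, test_set, target=0.95):
    """True if measured accuracy reaches the agreed acceptance criterion."""
    return accuracy(predict, test_set) >= target

# Hypothetical data: (scratch length in tenths of a mm, label).
mugs = [(s, "defect" if s > 20 else "ok") for s in range(50)]
train_set, test_set = split_train_test(mugs)

# Hypothetical stand-in for a trained model: a slightly wrong threshold.
def threshold_model(scratch):
    return "defect" if scratch > 22 else "ok"

print(f"test accuracy: {accuracy(threshold_model, test_set):.2f}")
print("accepted:", meets_acceptance(threshold_model, test_set))
```

With only 10 held-out samples the measured accuracy is noisy, which is one reason the notes suggest a test set of 1,000-2,000 samples.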
AI tools
ML Frameworks
PyTorch, TensorFlow, Hugging Face, PaddlePaddle, Scikit-learn, R.
Research publications
arXiv
Open source projects
GitHub
Smart speaker example
"Hey device, tell me a joke"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent (joke? time? music? call? weather?))
Execute action (if the intent is joke, get a joke from the database and return it as text)
AI pipeline: audio --> trigger word detection --> speech recognition --> intent recognition --> execute action
"Hey device, set timer for 10 minutes"
Trigger word detection (input: audio, output: has_trigger_word_spoken)
Speech recognition (input: audio, output: text)
Intent recognition (input: text, output: intent (set_timer?) and timer_duration)
Execute action (if the intent is set_timer, set the timer for timer_duration minutes)
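The four stages above can be sketched as a chain of functions. This is only an illustration of how the pipeline fits together: each stage here is a hand-written stand-in operating on a string rather than on real audio, and the names `TRIGGER`, `JOKES`, and every function are hypothetical. In a real smart speaker each stage would typically be its own trained model.

```python
import re

TRIGGER = "hey device"
JOKES = ["Why did the mug fail inspection? It cracked under pressure."]

def detect_trigger_word(audio):
    # Stand-in: "audio" is already a string; a real system runs a model on sound.
    return audio.lower().startswith(TRIGGER)

def recognize_speech(audio):
    # Stand-in for speech recognition: drop the trigger phrase, keep the command.
    return audio[len(TRIGGER):].strip(" ,")

def recognize_intent(text):
    lowered = text.lower()
    if "joke" in lowered:
        return {"intent": "joke"}
    match = re.search(r"timer for (\d+) minutes?", lowered)
    if match:
        return {"intent": "set_timer", "timer_duration": int(match.group(1))}
    return {"intent": "unknown"}

def execute_action(intent):
    if intent["intent"] == "joke":
        return JOKES[0]
    if intent["intent"] == "set_timer":
        return f"Timer set for {intent['timer_duration']} minutes."
    return "Sorry, I did not understand."

def smart_speaker(audio):
    """Run the full pipeline; return None when the trigger word is absent."""
    if not detect_trigger_word(audio):
        return None
    text = recognize_speech(audio)
    return execute_action(recognize_intent(text))

print(smart_speaker("Hey device, set timer for 10 minutes"))
```

Keeping the stages separate means each one can be built, measured, and improved by a different part of the team.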
Self-driving cars
Inputs: image / radar / lidar
Object detection (car detection, pedestrian detection, traffic sign detection, etc.)
Lane detection (detects the lanes on the road)
Outputs the position of the lanes.
Trajectory prediction (predicts where the detected objects will be in the future)
Outputs the predicted position and speed of the detected objects.
Motion planning (decides how to move the car, based on the detected objects, without collisions)
Outputs the path and speed of the car.
The path should avoid obstacles and follow traffic rules.
Control: steer / accelerate / brake
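As a toy illustration of the trajectory prediction and motion planning stages, the sketch below assumes detected objects move at constant velocity and that the car drives straight ahead along the lane. Those assumptions, the `DetectedObject` type, the 1.5 m lane half-width, and the function names are all hypothetical simplifications; real planners are far more involved.

```python
from dataclasses import dataclass

@dataclass
class DetectedObject:
    x: float   # metres ahead of the car
    y: float   # metres to the side of the lane centre
    vx: float  # metres per second along the road
    vy: float  # metres per second across the road

def predict_trajectory(obj, horizon_s=3):
    """Constant-velocity guess at the object's position after 1..horizon_s seconds."""
    return [(obj.x + obj.vx * t, obj.y + obj.vy * t) for t in range(1, horizon_s + 1)]

def plan_speed(car_speed, objects, safe_gap_m=5.0, horizon_s=3):
    """Pick the fastest candidate speed that keeps a safe gap; otherwise brake."""
    for speed in (car_speed, car_speed / 2, 0.0):
        safe = True
        for obj in objects:
            for t, (ox, oy) in enumerate(predict_trajectory(obj, horizon_s), start=1):
                car_x = speed * t
                in_lane = abs(oy) < 1.5
                if in_lane and abs(ox - car_x) < safe_gap_m:
                    safe = False
        if safe:
            return speed
    return 0.0  # no candidate was safe: brake

# Hypothetical pedestrian crossing the road 22 m ahead.
pedestrian = DetectedObject(x=22.0, y=5.0, vx=0.0, vy=-2.0)
print(plan_speed(10.0, [pedestrian]))
```

The planner consumes the predictions rather than the raw sensor data, which mirrors how each stage in the notes feeds the next.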
Roles and responsibilities in an AI team
Software Engineer
Builds the surrounding software, e.g., joke execution, timer execution, etc.
ML Engineer
Data gathering
Train a neural network
Test output
ML Researcher
Research new algorithms
Improve existing algorithms
Applied ML Scientist
Somewhere between ML engineer and ML researcher
Data Scientist
Examine data and provide insights
Create dashboards, reports, and presentations for the team/executives.
Data Engineer
Organize data
Make sure data is stored securely, accessibly, and cost-effectively.
AI Product Manager
Helps decide what to build: what's feasible and valuable.
You can start with a small team; you don't need a large one to start an AI project. Just you, an AI course, and a dataset are enough to begin.
AI transformation playbook
Execute small pilot projects to gain momentum
Can be in-house or outsourced.
Show traction within 6-12 months.
It's more important for the first project to succeed than to be big.
Build an in-house AI team
CEO, CAIO (Chief AI Officer)
AI team (central AI team)
Business unit 1
Business unit 2
Business unit 3 (gift card)
The central AI team acts like an internal consultancy that helps the business units implement AI in their projects.
It can also help build company-wide data infrastructure / platforms.
It is better for the AI team to have its own funding than to rely on the business units for funding.
Provide broad AI training
Executives and business leaders (what AI can do for your enterprise, AI strategy, resource allocation)
Pod leaders (project direction, resource allocation, monitoring progress)
AI engineers (100 hrs of training; build and ship AI software, gather data, execute on specific AI projects)
Develop an AI strategy
Leverage AI to create an advantage specific to your industry sector.
Virtuous cycle of AI (better product --> more users --> more data --> better product)
Consider creating a data strategy
Strategic data acquisition (offer free services to collect data, e.g., Gmail)
Unified data warehouses (collect data from all business units and store it in a central place)
Create network effects and platform advantages
In industries where "winner takes all" is common, like social media and search engines, AI can help accelerate the network effects.
Develop internal and external communications
Investor relations
Government relations (regulations, compliance, etc.)
Customer / user education
Talent / recruitment
Internal communications
AI pitfalls to avoid
Don't expect AI to solve all your problems. Be realistic about what AI can do.
Don't just hire 3-4 ML engineers and expect them to solve all your problems. You need a team with diverse skill sets, including data scientists, data engineers, software engineers, etc.
Don't expect AI projects to succeed on the first try. AI projects are iterative; be prepared to fail and learn from your mistakes.
Traditional project planning doesn't work for AI projects. Work with your AI team to define the scope, timeline, and acceptance criteria for the project. AI KPIs differ from traditional software KPIs, so define them for each AI project.
You don't need a superstar AI engineer to start an AI project. You can start with a small team and online training.
Taking your first step in AI
Get friends to learn about AI
Start brainstorming projects
Hire a few ML/DS people to help
Hire or appoint an AI leader
Discuss the possibilities of AI transformation with the CEO/board
Survey of major AI application areas
Supervised learning
Computer vision
Image classification (the whole image is labeled) / Object recognition (parts of the image are labeled)
Facial recognition
Object detection (finds the position of objects in an image, classifies them, and draws a box around each object)
Image segmentation (is this pixel part of a face, a car, or a tree? Draws precise boundaries of objects in an image)
Tracking (follows objects in a video, like a car or a person)
Natural language processing (NLP)
Text classification (spam detection, sentiment analysis, etc.)
Information retrieval (search engines, question answering)
Named entity recognition (NER) (extracts names, dates, locations, etc. from text)
Machine translation (translates text from one language to another)
Speech processing
A microphone records very rapid changes in air pressure
Speech recognition (takes audio as input and outputs text)
Trigger word detection (detects if a specific word is spoken, like "Alexa", "Hey Google", etc.)
Speaker ID (listens to someone speak and identifies who it is)
Speech synthesis (text to speech, converts text to audio)
Generative AI
Creates high-quality content, like images, text, audio, etc.
Input prompt, output content
Can create images, videos, text, audio, music, etc.
Robotics
Perception (figures out what is in the environment, based on sensor input data)
Motion planning (figures out how to move the robot, based on the perception data)
Control (executes the motion plan, and moves the robot)
General machine learning
Unstructured data (images, audio, text, etc.)
Structured data (tabular data, like Excel sheets, databases, etc.)
Unsupervised learning
Clustering
Example: price per packet vs. number of packets sold
Detects purchase patterns in retail data
Groups similar items together, like customers, products, etc.
E.g., college students buy more energy drinks and less coffee
Each data point lives in a high-dimensional space with features like price, quantity, location, etc.
Relationships between data points are discovered automatically, without any labels.
Can come up with new insights, like "customers who buy energy drinks also buy chips", or "customers who buy coffee also buy pastries".
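To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. k-means is one common clustering algorithm, chosen here only for illustration; the notes do not name a specific one. The data points (price per packet, packets sold) are hypothetical, and the evenly spaced starting centroids are a simplification to keep the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    # Assumption: evenly spaced starting centroids keep the sketch deterministic.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid. No labels needed.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical retail data: (price per packet, packets sold per week).
sales = [(1.0, 9.0), (1.2, 8.5), (0.9, 9.5), (3.0, 2.0), (3.2, 1.5), (2.8, 2.5)]
labels, centroids = kmeans(sales, k=2)
print(labels)
```

The algorithm is never told which group a product belongs to; the grouping emerges from the distances between points.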
Transfer learning
A model trained to detect cars on 100,000 images can be adapted to detect golf carts using only 100 golf cart images.
Reinforcement learning
A drone learns to fly itself by trying different actions and getting feedback from the environment.
A pet dog learns to behave well by getting treats for good behavior and scolding for bad behavior.
Reinforcing good behavior and punishing bad behavior.
Uses a "reward signal" to tell when the AI is doing well or not.
Needs many iterations to learn the best actions. (The training runs themselves generate a lot of data.)
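The reward-signal loop can be sketched with a deliberately simplified setup: an epsilon-greedy learner choosing among a few fixed actions (a multi-armed bandit, not full reinforcement learning with states). The action names, reward values, and `train_drone` function are hypothetical.

```python
import random

def train_drone(rewards, episodes=200, epsilon=0.1, seed=0):
    """Learn the average reward of each action by trial and feedback."""
    rng = random.Random(seed)
    actions = list(rewards)
    value = {a: 0.0 for a in actions}  # running average reward per action
    count = {a: 0 for a in actions}
    for episode in range(episodes):
        if episode < len(actions):
            action = actions[episode]             # try every action once first
        elif rng.random() < epsilon:
            action = rng.choice(actions)          # explore occasionally
        else:
            action = max(actions, key=value.get)  # exploit the best known action
        reward = rewards[action]                  # feedback from the environment
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value, count

# Hypothetical reward signal: positive for good behavior, negative for bad.
rewards = {"stay_level": 1.0, "tilt_sharply": -1.0, "cut_power": -5.0}
value, count = train_drone(rewards)
best_action = max(value, key=value.get)
print(best_action, count)
```

Most of the 200 trials end up spent on the action with the highest reward, which is the "reinforcing good behavior" effect described above.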
Generative adversarial networks (GANs)
Synthetic data generation
AI-generated supermodel images
Knowledge graph
A graph that represents knowledge in a structured way
Nodes represent entities, and edges represent relationships between entities
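A knowledge graph can be sketched as a set of (entity, relationship, entity) triples. The class below is a minimal in-memory illustration; the `KnowledgeGraph` class, its method names, and the example facts are all hypothetical.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Nodes are entities; each edge is a named relationship between two entities."""

    def __init__(self):
        self.edges = defaultdict(list)  # subject -> [(relation, object), ...]

    def add(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def query(self, subject, relation):
        """Return every entity linked to `subject` by `relation`."""
        return [obj for rel, obj in self.edges.get(subject, []) if rel == relation]

graph = KnowledgeGraph()
graph.add("coffee mug", "made_of", "clay")
graph.add("coffee mug", "finished_with", "glaze")
graph.add("clay", "fired_in", "kiln")

print(graph.query("coffee mug", "made_of"))
```

Following edges from one node to the next (mug --> clay --> kiln) is what lets a system answer questions that no single fact states directly.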
What is the workflow of a data science project?
Optimizing a sales funnel
Collect data on user behavior on the website (visits, clicks, time spent, etc.)
Analyze the data to find insights: "Overseas users leave when they find high shipping costs." "Spend fewer marketing dollars on overseas users."
Suggest hypotheses/actions to improve the sales funnel: "Reduce shipping costs for overseas users." Then re-analyze the data to see if the changes had an impact.
Optimizing the manufacturing line
Steps: mix clay --> shape mug --> add glaze --> fire kiln --> final inspection
Collect data on the manufacturing process (temperature, pressure, time, etc.)
Analyze the data to find insights: "High temperature leads to more defects." "Reduce temperature to reduce defects." "Because the ambient temperature is warmer in the afternoon, we need to reduce the temperature in the afternoon."
Suggest hypotheses/actions to improve the manufacturing process or yield: "Reduce temperature to reduce defects." Then re-analyze the data to see if the changes had an impact.
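The "analyze the data to find insights" step for the manufacturing line might look like the sketch below, which groups kiln records into temperature buckets and compares defect rates. The records, the bucket size, and the function name are hypothetical; they only illustrate how an insight such as "high temperature leads to more defects" could be read off the data.

```python
from collections import defaultdict

def defect_rate_by_bucket(records, bucket_size=50):
    """Group (temperature, has_defect) records into buckets and compute defect rates."""
    totals = defaultdict(int)
    defects = defaultdict(int)
    for temperature, has_defect in records:
        bucket = (temperature // bucket_size) * bucket_size
        totals[bucket] += 1
        defects[bucket] += int(has_defect)
    return {bucket: defects[bucket] / totals[bucket] for bucket in sorted(totals)}

# Hypothetical kiln log: (temperature in degrees, whether the mug was defective).
kiln_log = [
    (1010, False), (1020, False), (1030, False), (1040, True),
    (1060, True), (1070, True), (1080, False), (1090, True),
]
rates = defect_rate_by_bucket(kiln_log)
for bucket, rate in rates.items():
    print(f"{bucket}-{bucket + 49}: {rate:.0%} defective")
```

Re-running the same analysis after a change (for example, a lower afternoon temperature) is how the team checks whether the change had an impact.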
Every job function needs to learn how to use data
Data science can help optimize the sales funnel; machine learning can automate lead sorting.
Data science can help optimize the manufacturing process; machine learning can automate quality control.
Data science can help optimize the recruiting funnel; machine learning can automate resume screening. (Your system should be ethical and not biased.)
Data science can help optimize the user experience of a website; machine learning can automate content recommendations, suggest push notifications, etc.
Data science can help suggest what to plant, when, and where; machine learning can detect where weeds are and help automate weeding.
How to choose an AI project?
What AI can do (AI experts)
Valuable for your business (domain experts)
Select a project in the overlap of the two
Framework for brainstorming AI projects
Think about automating tasks rather than automating jobs.
What are the main drivers of business value?
What are the main pain points in your business?
You can make progress even without big data
Having more data almost never hurts.
Data makes some businesses defensible. (It is difficult for new players to enter.)
Even with small datasets, you can still make progress.
Due diligence before starting an AI project
What can AI do? What's valuable for your business?