Data Science 101: 9 Terms You Should Know
Before you can be a data scientist, you have to be able to speak like a data scientist.
Think of learning how to talk about data science like learning a new language. There are a few fundamental words to be learned and mastered. From there, it’s a matter of building out your data science vocabulary as you come across new ideas and concepts. Luckily, you can learn how to talk like a data scientist faster than you might have thought.
We go over nine terms that every beginning data scientist should know. To make each data science term more relatable, we provide a real-world example to show how the concept is used. Read through each example and make your best guess at what each term means. The full list of definitions can be found at the end of this post.
Enjoy learning to talk data!
Artificial Intelligence
Real-world use: An autonomous vehicle that can learn to identify and react to driving obstacles in real time.
Data Lake
Real-world use: An enterprise looking for reliable, accessible storage of petabytes of data that has no need for immediate analysis.
Database
Real-world use: An ecommerce website that must process data associated with millions of users browsing for products, reading and writing reviews, and placing orders.
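As a rough illustration of the store, read, and update operations behind a use case like this one, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table, column names, and sample rows are hypothetical, and a site serving millions of users would likely rely on a server-based database rather than SQLite.

```python
import sqlite3

# In-memory database for illustration only; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, product TEXT, quantity INTEGER)"
)

# Store: write a few orders.
conn.executemany(
    "INSERT INTO orders (customer, product, quantity) VALUES (?, ?, ?)",
    [("ana", "lamp", 1), ("ben", "desk", 1), ("ana", "chair", 2)],
)

# Transform: update an existing order.
conn.execute(
    "UPDATE orders SET quantity = 3 WHERE customer = ? AND product = ?",
    ("ana", "chair"),
)
conn.commit()

# Access: read back the total quantity ordered by each customer.
totals = conn.execute(
    "SELECT customer, SUM(quantity) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
conn.close()
```

The same three steps (write, update, read) are what the site performs each time a user places an order or posts a review, only at far greater volume.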
Data Warehouse
Real-world use: A handful of internal users at an enterprise looking to find the fastest-growing product and the most profitable product categories.
Heteroscedasticity, Heteroscedastic data
Real-world use: A viral social media post that has garnered thousands of comments and millions of views in a short period of time.
Image Recognition
Real-world use: An automobile manufacturer that wants to detect and alert drivers who may be fatigued.
Machine Learning
Real-world use: An online retailer that wants to find and show each user the products they are most likely to buy.
Natural Language Processing
Real-world use: A digital assistant that identifies and presents relevant information based on your voice instructions.
Visualization
Real-world use: A university that wants to create a dashboard to understand the factors, such as enrollment, course offerings, and class sizes, that impact student success.
Artificial intelligence – the ability of machines to use logic and advanced computation to learn and make decisions similar to humans
Data lake – a low-cost, reliable storage solution for large volumes of raw data that’s accessible from anywhere
Database – a system where data can be stored, accessed, and transformed
Data warehouse – a central store of structured data that companies query and mine for business intelligence
Heteroscedasticity, heteroscedastic data – data whose variability is not constant, so the spread of values grows or shrinks over time or across the range of another variable
Image recognition – the use of computers to understand, identify, and classify objects in images
Machine learning – an application of computer science to train machines using advanced models to understand patterns in data
Natural language processing – the use of computers to interpret and respond to spoken human language
Visualization – a method to understand the value of data visually and intuitively
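Of the nine terms, heteroscedasticity is probably the hardest to picture, so here is a minimal sketch of it in its statistical sense: the spread of the data is not constant. The code simulates hypothetical points whose noise grows with x, then compares the spread of the residuals for small and large x. The function names, the slope of 2.0, and the noise scale are all assumptions chosen for illustration, and it uses only Python's standard library.

```python
import random
import statistics

SLOPE = 2.0  # assumed underlying trend: y = 2x plus noise


def simulate_points(n_points=2000, seed=0):
    """Generate hypothetical (x, y) pairs whose noise grows with x."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        x = rng.uniform(1, 10)
        noise = rng.gauss(0, 0.5 * x)  # standard deviation scales with x
        points.append((x, SLOPE * x + noise))
    return points


def residual_spread(points, split=5.5):
    """Return the residual standard deviation for x below and above split."""
    low = [y - SLOPE * x for x, y in points if x < split]
    high = [y - SLOPE * x for x, y in points if x >= split]
    return statistics.stdev(low), statistics.stdev(high)


points = simulate_points()
low_sd, high_sd = residual_spread(points)
print(f"spread for small x: {low_sd:.2f}")
print(f"spread for large x: {high_sd:.2f}")
```

If the data had constant variability, the two numbers would be roughly equal. Here the spread for large x comes out noticeably bigger, which is the signature of heteroscedastic data.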