What Kind of Data Does AI Use

By Tatiana Mikhaleva · Developer Advocate · Docker Captain · IBM Champion

Hey, you brilliant tech babes (and kings — I see you too 👀)!

You already know how AI learns. You know where it shows up in your day. There’s one last piece, though, and honestly it’s the big one:

It’s all about the data, darling.
Like, literally. No data = no AI.

So here’s the plan. We’ll walk through the types of data, why each one matters, and the kind of mess that happens when your data is a disaster (spoiler: everything goes sideways).

Grab your iced latte. Let’s talk dirty. Data dirty.

Structured vs. Unstructured: What’s the Tea?

Structured Data = Neat girl energy

Think spreadsheets, columns, drop-down menus. Basically, info that fits in tidy rows and is super easy to organize.

Real-life example:
Your online store tracks shoppers by name, age, and how much they spent last month.

  • Name: Emma
  • Age: 28
  • Purchase: $89.99
  • VIP status: Yes

AI eats this right up. Easy to analyze, easy to love. It’s a classic starting point for supervised learning: every row has the same fields, so it’s simple to attach a label and let a model find the pattern.
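
Here’s a tiny Python sketch of what "tidy rows" looks like in practice. Emma comes from the example above; the other two shoppers and the helper names (`average_purchase`, `vip_names`) are made up purely for illustration.

```python
# Structured data: every row has the same fields.
# Emma is from the example above; Liam and Sofia are hypothetical.
shoppers = [
    {"name": "Emma", "age": 28, "purchase": 89.99, "vip": True},
    {"name": "Liam", "age": 35, "purchase": 24.50, "vip": False},
    {"name": "Sofia", "age": 41, "purchase": 130.00, "vip": True},
]


def average_purchase(rows):
    """Mean of the 'purchase' column."""
    return sum(row["purchase"] for row in rows) / len(rows)


def vip_names(rows):
    """Names of shoppers flagged as VIP."""
    return [row["name"] for row in rows if row["vip"]]


print(round(average_purchase(shoppers), 2))  # 81.5
print(vip_names(shoppers))                   # ['Emma', 'Sofia']
```

Because each field sits in a predictable place, questions like "what’s the average spend?" take one line to answer. That predictability is what makes structured data so easy to work with.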

Unstructured Data = Total chaos… but gold

Now the messy stuff. The pile AI has to decode:

  • Tweets
  • DMs
  • Selfies
  • Voice notes
  • TikToks
  • Review texts
  • Screenshots of what your friend sent you at 2am

Way harder to process. But this is where the juicy, human stuff lives.
Give AI the right tools (NLP for text, computer vision for images) and suddenly it can pick up on what people are saying, seeing, and reacting to. Wild, right?
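
To make "decoding" a bit more concrete, here’s a very rough Python sketch that turns a messy review into word tokens and a mood score. The word lists and function names are assumptions invented for this example. Real NLP tools rely on trained models, not a handful of hand-picked words.

```python
import re
from collections import Counter

# Tiny hand-picked word lists, purely illustrative (not a real sentiment lexicon).
POSITIVE = {"loved", "love", "great", "obsessed"}
NEGATIVE = {"hate", "hated", "broken", "refund"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def mood_score(text):
    """Positive word count minus negative word count; above 0 leans positive."""
    counts = Counter(tokenize(text))
    positive_hits = sum(counts[word] for word in POSITIVE)
    negative_hits = sum(counts[word] for word in NEGATIVE)
    return positive_hits - negative_hits


review = "Ugh, I loved it so much!"
print(tokenize(review))    # ['ugh', 'i', 'loved', 'it', 'so', 'much']
print(mood_score(review))  # 1
```

Notice the first step: free-form text only becomes useful once it’s converted into something countable.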

The Data Types AI Actually Works With

Here’s what shows up in real-world projects, laid out like a cute cheat sheet:

| Data Type | Looks Like… | Example | What AI Does With It |
| --- | --- | --- | --- |
| Numerical | Numbers, prices, scores | 5, 29.99, 90% | Normalize, scale, compare |
| Categorical | Labels, options, categories | “Beginner”, “NY”, “Tote bag” | Convert to machine-friendly codes |
| Text | Reviews, messages, bios | “Ugh, I loved it so much!” | Understand mood, topic, emotion |
| Image | Selfies, product pics, screenshots | Photo of a red dress | Detect shapes, colors, objects |
| Audio | Voice memos, customer calls | Voice note: “Sooo… mad or just tired?” | Turn into text or audio features |
| Video | Insta Reels, tutorials, dashcams | Makeup GRWM TikTok | Analyze image + sound together |
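
The first two rows are the easiest to show in code. Below is a minimal Python sketch of scaling numbers and converting labels to machine-friendly codes. It uses min-max scaling and one-hot encoding, which are two common choices among several. The function names and the "Advanced" label are assumptions added for the example.

```python
def min_max_scale(values):
    """Rescale numbers to the 0-1 range so they are easier to compare."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


def one_hot(labels):
    """Turn each label into a list of 0/1 flags, one flag per category."""
    categories = sorted(set(labels))
    rows = [[1 if label == category else 0 for category in categories]
            for label in labels]
    return categories, rows


prices = [5, 29.99, 90]
levels = ["Beginner", "Advanced", "Beginner"]  # "Advanced" is hypothetical

print([round(v, 3) for v in min_max_scale(prices)])  # [0.0, 0.294, 1.0]
print(one_hot(levels))  # (['Advanced', 'Beginner'], [[0, 1], [1, 0], [0, 1]])
```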

⚙️ What Makes Data “Model-Ready”?

You can’t just toss raw data into a model and pray. That’s like showing up to a glam shoot with bed hair and mismatched socks, sis.

Your data needs a serious prep routine first:

  1. Cleaning: remove duplicates, fix typos, and deal with missing values
  2. Formatting: make sure prices are numbers, dates are dates, and text is readable
  3. Labeling: if your model needs answers, you have to tell it what’s what (e.g. “This is a cat. That’s a croissant.”)
  4. Balancing: avoid over-representing one side (like 90% apples and 10% oranges; we’re not doing fruit bias today)
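
The routine above can be sketched in a few lines of Python. This is a simplified illustration, not a production pipeline. The sample rows, field names, and helper functions are all hypothetical, and "balancing" here only means checking the label counts, not fixing them.

```python
from collections import Counter

# Hypothetical raw rows: one duplicate, prices stored as text, one missing label.
raw_rows = [
    {"text": "Fluffy cat on a sofa", "price": "12.50", "label": "cat"},
    {"text": "Fluffy cat on a sofa", "price": "12.50", "label": "cat"},
    {"text": "Buttery croissant", "price": "3", "label": "croissant"},
    {"text": "Blurry photo", "price": "oops", "label": None},
]


def clean(rows):
    """Cleaning: drop exact duplicates and rows that have no label."""
    seen, kept = set(), []
    for row in rows:
        key = (row["text"], row["price"], row["label"])
        if key in seen or row["label"] is None:
            continue
        seen.add(key)
        kept.append(row)
    return kept


def format_rows(rows):
    """Formatting: convert every price to a float; skip rows that cannot be parsed."""
    formatted = []
    for row in rows:
        try:
            formatted.append({**row, "price": float(row["price"])})
        except ValueError:
            continue
    return formatted


def label_counts(rows):
    """Balancing check: how many examples does each label have?"""
    return Counter(row["label"] for row in rows)


ready = format_rows(clean(raw_rows))
print(len(ready))                 # 2
print(dict(label_counts(ready)))  # {'cat': 1, 'croissant': 1}
```

If one label had dwarfed the others in that last count, that would be the cue to collect more examples of the smaller class before training.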

Why Bad Data = Bad AI

Real talk. The smartest AI on earth won’t save your butt if the data is a mess.
Skip the prep and here’s your reward:

  • Your model makes wrong predictions
  • It becomes biased (like recommending only men’s products if it’s seen mostly male data)
  • It just… breaks. And blames you silently.

Example:

Say you want your AI to clock whether a voice note sounds angry or chill. But almost all your training data is people being excited. So someone calmly says “I’m fine,” and the model decides they’re thrilled. Or worse, lying.

Not AI’s fault, queen. That’s ✨data drama✨.
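
Here’s a small Python sketch of why that happens. It assumes a made-up training set of 100 voice-note labels and a "model" that does the laziest thing possible: always guess the most common label. Real models are smarter than this, but heavily skewed data pulls them in the same direction.

```python
from collections import Counter

# Hypothetical labels for 100 voice notes, heavily skewed toward "excited".
train_labels = ["excited"] * 90 + ["calm"] * 10


def majority_label(labels):
    """The label that appears most often in the training data."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, truth):
    """Share of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


guess = majority_label(train_labels)

# Looks impressive overall...
overall = accuracy([guess] * len(train_labels), train_labels)

# ...but every calm voice note is misread.
calm_only = [label for label in train_labels if label == "calm"]
calm_accuracy = accuracy([guess] * len(calm_only), calm_only)

print(guess)          # excited
print(overall)        # 0.9
print(calm_accuracy)  # 0.0
```

A 90% score sounds great until you notice the model is wrong about every calm speaker. That’s why checking results per class matters, not just the overall number.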

So… What Should You Know as a Beginner?

You might not be building models yet. Doesn’t matter. Just knowing this puts you ahead:

  • Data is the fuel AI runs on
  • Clean, balanced data = accurate, smart AI
  • Unstructured data is harder, but so much more real and emotional
  • You can totally start working with it — one TikTok caption or product review at a time

Final Thoughts from a Data-Loving IT Girl

People think AI is all about code and algorithms. But girl, no.
AI is 80% data, 20% logic — and 100% what you feed it.

Learning AI? Working in tech? Just curious how Netflix knows you want K-dramas at 2am? It always traces back to the data.

Clean it. Understand it. Respect it.

Because even your future AI model needs a solid skincare routine before it glows.


Tatiana Mikhaleva

Docker Captain  ·  IBM Champion  ·  AWS Community Builder

DevOps.Pink — cloud-native education for the agentic-AI era.

Source: What Kind of Data Does AI Use, https://devops.pink/what-kind-of-data-does-ai-use/
Author: Tatiana Mikhaleva
Published: 2025-04-14
License: CC BY-NC-SA 4.0