Getting your data – data sources run-through – Lesson Preview
This practical lesson shows marketing and SEO professionals how to build a robust keyword universe, the backbone of every semantic keyword research project. You’ll revisit the difference between traditional and semantic keyword research, then map out the first, critical step: collecting queries from diverse sources so that later analysis (clustering, journey mapping, topical maps, content briefs) is trustworthy.
You’ll learn which sources are “must-have,” what each contributes, and where APIs beat no-code tools for scale and cost. Lazarina stresses that data in = data out: richer inputs lead to better insights. You’ll see how to combine ranked terms from Search Console, competitor datasets (business vs. organic), People Also Ask expansions, autosuggest outputs, trend signals, and your own pattern-based keywords. There’s also guidance on identifying organic competitors via SERP checks, and on when to harvest a competitor’s full keyword library versus only the URLs that rank for your terms.
By the end, you’ll have a clear blueprint to compile, merge, and enrich large exports (in Sheets, BigQuery, or Python), so you can move confidently into the analysis phase knowing your foundations are solid.
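As a rough illustration of the compile-and-merge step, the sketch below folds queries from several sources into one de-duplicated list and records which sources each query came from. It is a minimal sketch, not the course's script: the source names, the sample queries, and the normalization rule (lowercase plus collapsed whitespace) are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(query.lower().split())


def build_universe(sources: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalized query to the set of sources that contained it."""
    universe: dict[str, set[str]] = defaultdict(set)
    for source_name, queries in sources.items():
        for query in queries:
            key = normalize(query)
            if key:  # skip empty rows that often appear in exports
                universe[key].add(source_name)
    return dict(universe)


# Hypothetical sample data standing in for real exports.
sources = {
    "search_console": ["Running Shoes", "trail running shoes "],
    "autosuggest": ["running shoes", "running shoes for flat feet"],
    "paa": ["how long do running shoes last", ""],
}

universe = build_universe(sources)
for query, found_in in sorted(universe.items()):
    print(f"{query}\t{','.join(sorted(found_in))}")
```

Keeping the source set per query, rather than a flat list, makes it easy to see later which queries only one source surfaced.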
What you’ll learn (why it matters)
- Build a complete keyword universe — because later analysis depends on coverage.
- Combine data sources effectively — because single-source lists miss intent and opportunity.
- Separate business vs. organic competitors — because strategy and partnerships differ.
- Leverage autosuggest & PAA at scale — because they expose next-step user journeys.
- Add pattern-based keywords (EAV) — because programmatic patterns reveal systematic gaps.
- Use APIs for scale and cost control — because large projects outgrow manual/no-code flows.
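Pattern-based keywords can be sketched as a simple template expansion: take a list of entities, a set of attributes with their possible values, and a few query templates, then generate every combination. The entities, attributes, values, and templates below are hypothetical placeholders, and this is only one possible way to express the idea in code.

```python
from itertools import product

# Hypothetical EAV inputs: entities, and attributes mapped to their values.
entities = ["running shoes", "hiking boots"]
attributes = {
    "color": ["black", "white"],
    "price": ["under 100"],
}
# Hypothetical query templates; {entity} and {value} are filled per combination.
templates = ["{entity} {value}", "best {entity} {value}"]


def expand_patterns(entities, attributes, templates):
    """Return unique keywords for every entity/value/template combination."""
    keywords = []
    seen = set()
    for entity, (attribute, values) in product(entities, attributes.items()):
        for value, template in product(values, templates):
            keyword = template.format(entity=entity, attribute=attribute, value=value)
            if keyword not in seen:  # keep first occurrence, preserve order
                seen.add(keyword)
                keywords.append(keyword)
    return keywords


pattern_keywords = expand_patterns(entities, attributes, templates)
for keyword in pattern_keywords:
    print(keyword)
```

Generated terms are only candidates; they would normally be checked against real demand data before being added to the universe.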
Key concepts (with mini-definitions)
- Semantic keyword research — collecting and relating queries using meaning, entities, and intent.
- Keyword universe — the consolidated, de-duplicated master list of all candidate queries.
- Business vs. organic competitors — revenue competitors vs. traffic competitors (e.g., affiliates, creators).
- People Also Ask (PAA) — question expansions showing common follow-ups and answers.
- Pattern-based keywords — terms generated from recurring query templates (brand/product/entity patterns).
- EAV model — using Entities, Attributes, and Values to discover and complete keyword patterns.
- Content gap analysis — comparing your coverage vs. competitors to find missed terms.
- SERP analysis for competitors — using top terms to identify who appears most often and in which formats.
- Trending keywords — rising topics from trend tools to keep the universe fresh.
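One way to approximate SERP analysis for competitors is to count how many of your top terms each domain ranks for. The sketch below assumes you already have the ranking URLs per query (for example from a rank tracker or SERP API export); the queries and `.example` URLs are invented for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def count_serp_domains(serps: dict[str, list[str]]) -> Counter:
    """Count, per domain, the number of queries it appears for."""
    counts: Counter = Counter()
    for urls in serps.values():
        # A set ensures a domain is counted once per query,
        # even if it holds several positions on that SERP.
        counts.update({domain_of(url) for url in urls})
    return counts


# Hypothetical SERP data: query -> ranking URLs.
serps = {
    "running shoes": [
        "https://www.shop-a.example/running",
        "https://reviews.example/best-running-shoes",
        "https://shop-a.example/sale",
    ],
    "trail running shoes": [
        "https://reviews.example/trail",
        "https://www.shop-b.example/trail",
    ],
    "running shoes for flat feet": [
        "https://reviews.example/flat-feet",
        "https://shop-a.example/flat-feet",
    ],
}

domain_counts = count_serp_domains(serps)
for domain, hits in domain_counts.most_common():
    print(f"{domain}\t{hits}")
```

Domains that appear across many queries but do not sell what you sell are likely organic rather than business competitors.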
Tools mentioned
Google Search Console, Search Analytics for Sheets, Google Sheets, BigQuery, Python, SEMrush, Ahrefs, SE Ranking, Google Autosuggest, Pemavor autocomplete keyword tool, DataForSEO, Google APIs (Autosuggest), Google Colab script (Autosuggest + clustering), Google Maps Platform (Places API: Query Autocomplete, Place Autocomplete), Keywords Everywhere, AlsoAsked, Keywords People Use, AnswerThePublic, Backlinko’s tool, SEO Minion (Chrome extension), Apify, Google Trends, Google Trends API, PyTrends (unofficial), Exploding Topics, Reddit, YouTube, TikTok, Google Ads Keyword Planner, TikTok Ads, YouTube Ads, internal site search, and Google Cloud Natural Language API (Entity Analysis).
Practice & readings
- Use the provided Google Colab script to expand 100 seed keywords with Autosuggest and cluster outputs.
- Merge multiple competitor exports (Excel/CSV) with the shared Python script; add a competitor name column from filenames or ranked URLs.
- Follow the practical lesson + blog walkthrough on using Google Maps Places API (Query/Place Autocomplete) to pull local keyword ideas.
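The competitor-merge exercise can be approximated with the standard library alone. The sketch below is not the shared course script. It assumes CSV exports (Excel files would need an extra library), a column named `Keyword`, and that each file is named after the competitor, all of which are hypothetical conventions for this example. The demo writes two small files to a temporary folder so the block runs on its own.

```python
import csv
import tempfile
from pathlib import Path


def merge_exports(folder: Path, keyword_column: str = "Keyword") -> list[dict[str, str]]:
    """Merge all CSV files in a folder, tagging rows with the file's stem."""
    rows = []
    for path in sorted(folder.glob("*.csv")):
        competitor = path.stem  # e.g. "shop-a" from "shop-a.csv"
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                keyword = (row.get(keyword_column) or "").strip()
                if keyword:  # drop rows with no keyword
                    rows.append({**row, "competitor": competitor})
    return rows


# Demo with hypothetical exports written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "shop-a.csv").write_text(
        "Keyword,Position\nrunning shoes,3\ntrail running shoes,8\n",
        encoding="utf-8",
    )
    (folder / "shop-b.csv").write_text(
        "Keyword,Position\nrunning shoes,1\n,5\n",
        encoding="utf-8",
    )
    merged = merge_exports(folder)

for row in merged:
    print(row)
```

If filenames are not reliable, the competitor name could instead be derived from the domain of each ranked URL, as the exercise suggests.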
Key insights & takeaways
- Data in = data out: invest in collection or your analysis will suffer.
- Blend sources: autosuggest, PAA, trends, and competitors surface different intent layers.
- APIs scale: for volume projects, APIs are cheaper, cleaner, and more repeatable than manual pulls.
- Differentiate competitors: partner with organic players; out-compete business rivals.
- Program patterns: EAV and templated queries unlock systematic, high-coverage lists.
