Introduction to Machine Learning for SEO

Practical: Query Entity Extraction with Google NLP and Entity ML-enabled data analysis

Entity analysis turns messy keyword lists into structured, decision-ready data. In this practical lesson, you’ll extract entities from search queries with Google’s Natural Language API and then enrich, score, and visualize those entities to reveal opportunities at scale. It’s built for marketers and SEOs who want more than tags: they want to see the real-world concepts behind queries and how those concepts connect across an entire keyword universe.

You’ll start with two executable paths: a no-code Google Sheets template for quick runs and a Google Colab notebook for fast, scalable processing. From there, you’ll merge API output with your master keyword sheet (search volume, competitiveness, CPC, etc.) to quantify opportunity, build a custom prominence score that works across short queries, surface co-occurring n-grams, and map relationships between entities.
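As a rough illustration of the merge step described above, the sketch below joins entity rows onto keyword metrics and totals search volume per entity. The column names (`keyword`, `entity`, `search_volume`, `cpc`), the sample rows, and both function names are assumptions made for this example; they are not the course template's schema.

```python
from collections import defaultdict

# Hypothetical sample data; column names are assumptions, not the course's schema.
master = [
    {"keyword": "nike running shoes", "search_volume": 5000, "cpc": 1.20},
    {"keyword": "adidas running shoes", "search_volume": 3000, "cpc": 0.90},
    {"keyword": "nike air max price", "search_volume": 2000, "cpc": 0.70},
]
entities = [
    {"keyword": "nike running shoes", "entity": "Nike", "type": "ORGANIZATION"},
    {"keyword": "nike running shoes", "entity": "running shoes", "type": "CONSUMER_GOOD"},
    {"keyword": "adidas running shoes", "entity": "Adidas", "type": "ORGANIZATION"},
    {"keyword": "adidas running shoes", "entity": "running shoes", "type": "CONSUMER_GOOD"},
    {"keyword": "nike air max price", "entity": "Nike", "type": "ORGANIZATION"},
    {"keyword": "nike air max price", "entity": "Air Max", "type": "CONSUMER_GOOD"},
]


def merge_entities_with_metrics(entity_rows, master_rows):
    """Inner-join entity rows onto keyword metrics, keyed on normalized keyword text."""
    metrics_by_keyword = {row["keyword"].strip().lower(): row for row in master_rows}
    merged = []
    for row in entity_rows:
        metrics = metrics_by_keyword.get(row["keyword"].strip().lower())
        if metrics is None:
            continue  # keyword is missing from the master sheet, so it is skipped
        merged.append({**metrics, **row})
    return merged


def volume_by_entity(merged_rows):
    """Sum search volume over every keyword that mentions each entity."""
    totals = defaultdict(int)
    for row in merged_rows:
        totals[row["entity"]] += row["search_volume"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


merged = merge_entities_with_metrics(entities, master)
totals = volume_by_entity(merged)
for entity, volume in totals.items():
    print(entity, volume)
```

On a real export the same join is often done with a spreadsheet lookup or a dataframe merge; the dictionary version here only keeps the example dependency-free.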

Finally, you’ll cluster semantically similar entities and optionally enrich priority entities via Google’s Knowledge Graph Search API, all with downloadable CSVs and visuals you can drop straight into content briefs and topical authority planning.
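For the optional enrichment step, a minimal sketch of building a Knowledge Graph Search API request and flattening its JSON-LD response might look like the following. The helper names and the sample payload are hypothetical (the `@id` value is a placeholder, not a real identifier), and no network call is made here.

```python
from urllib.parse import urlencode

KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_search_url(query, api_key, limit=1, language="en"):
    """Build the GET URL for one entity lookup (api_key is your own key)."""
    params = {"query": query, "key": api_key, "limit": limit, "languages": language}
    return f"{KG_ENDPOINT}?{urlencode(params)}"


def summarize_kg_response(payload):
    """Keep only the fields useful for a content brief from each result."""
    rows = []
    for item in payload.get("itemListElement", []):
        result = item.get("result", {})
        rows.append({
            "id": result.get("@id"),
            "name": result.get("name"),
            "types": result.get("@type", []),
            "description": result.get("description"),
            "score": item.get("resultScore"),
        })
    return rows


# Hand-written payload in the response shape; values are illustrative only.
sample_payload = {
    "itemListElement": [
        {
            "result": {
                "@id": "kg:/m/EXAMPLE",
                "name": "Nike",
                "@type": ["Organization", "Thing"],
                "description": "Footwear company",
            },
            "resultScore": 123.4,
        }
    ]
}

url = build_kg_search_url("Nike Air Max", api_key="YOUR_API_KEY")
summary = summarize_kg_response(sample_payload)
print(url)
print(summary)
```

Scoping these lookups to a short list of priority entities keeps the number of requests small.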


What you’ll learn (why it matters)

  • Extract query entities with Google NLP — because concepts > keywords for intent.
  • Scale via Colab + API quotas — because Sheets are slow on large sets.
  • Blend entity data with master metrics — because volume + relevance drives prioritization.
  • Build a dataset-level prominence score — because salience is weak on short queries.
  • Find top co-occurring n-grams — because phrasing guides briefs and on-page.
  • Graph and cluster entities — because relationships and proximity inform topical authority.
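To make the co-occurring n-gram idea from the list above concrete, here is a small sketch that counts the words and two-word phrases appearing alongside a given entity across a query list. The function names and sample queries are invented for illustration, tokenization is plain whitespace splitting, and stop words are not filtered, so treat it as a starting point and not as the lesson's implementation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token phrases from a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def find_span(tokens, needle):
    """Index where the entity's tokens start inside the query, or -1 if absent."""
    for i in range(len(tokens) - len(needle) + 1):
        if tokens[i:i + len(needle)] == needle:
            return i
    return -1


def top_cooccurring_ngrams(queries, entity, n_values=(1, 2), top_k=5):
    """Count n-grams found around the entity, once per query that mentions it."""
    needle = entity.lower().split()
    counts = Counter()
    for query in queries:
        tokens = query.lower().split()
        start = find_span(tokens, needle)
        if start < 0:
            continue
        # Split around the entity so n-grams never bridge the removed span.
        segments = [tokens[:start], tokens[start + len(needle):]]
        seen = set()
        for segment in segments:
            for n in n_values:
                seen.update(ngrams(segment, n))
        counts.update(seen)
    return counts.most_common(top_k)


queries = [
    "best running shoes for flat feet",
    "running shoes for flat feet women",
    "cheap running shoes for women",
    "nike running shoes sale",
    "trail hiking boots",
]
top = top_cooccurring_ngrams(queries, "running shoes", top_k=50)
print(top[:5])
```

Phrases such as "flat feet" that recur next to an entity are the kind of modifier that can feed a brief's subheadings.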

Key concepts (with mini-definitions)

  • Entity Extraction (NER) — supervised ML identifying real-world concepts in text.
  • Entity Type — category assigned by the API (person, org, event, etc.).
  • Salience/Prominence — importance of an entity within the analyzed text.
  • Entity Mentions & Variations — different surface forms mapped to the same entity.
  • Entity Sentiment — polarity/magnitude scores tied to specific entities.
  • TF-IDF — vectorizes text by weighting terms important to the dataset.
  • Cosine Similarity — measures semantic closeness between entity vectors.
  • Thresholding — filters weak links to keep only meaningful relationships.
  • K-Means Clustering — groups entities by semantic similarity.
  • PCA — reduces dimensions for 2D visualization while preserving structure.
  • NetworkX Graph — visual representation of entities and their connections.
  • Knowledge Graph Search API — returns IDs, descriptions, and related entities.
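Three of the concepts above (TF-IDF, cosine similarity, and thresholding) chain together naturally, and the sketch below shows one way that might look in plain Python. It assumes each entity is represented by the concatenated text of the queries that mention it, and it uses one common smoothed IDF variant; both are choices made for this example. K-Means and PCA are left out because they would normally come from a library such as scikit-learn.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """docs maps a name to text; returns a sparse {term: weight} vector per name."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        term_counts = Counter(tokens)
        vectors[name] = {
            # term frequency times a smoothed inverse document frequency
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_counts.items()
        }
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_edges(vectors, threshold):
    """Keep only entity pairs whose similarity reaches the threshold."""
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            edges.append((name_a, name_b, round(score, 3)))
    return edges


# Hypothetical entity "documents": the queries in which each entity appeared.
docs = {
    "Nike": "nike running shoes nike air max price",
    "Adidas": "adidas running shoes adidas ultraboost price",
    "Garmin": "garmin gps watch battery life",
}
vectors = tfidf_vectors(docs)
edges = similarity_edges(vectors, threshold=0.2)
print(edges)
```

The surviving `(entity, entity, score)` tuples are the sort of edge list a graph library such as NetworkX accepts; the threshold value is arbitrary and would need tuning on real data.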

Tools mentioned

Google Natural Language API, Google Cloud (APIs & quotas), Google Colab, Google Sheets, Apps Script, Excel, TF-IDF, cosine similarity, NetworkX, K-Means, PCA, Plotly, Google Knowledge Graph Search API, Python, and the MLforSEO no-code template.


Practice & readings

  • Run the no-code Sheets template (add API key) to extract entities from a small keyword list.
  • Use the Google Colab notebook to process ~4k–8k keywords, then merge the output with your master sheet.
  • Read the MLforSEO blog posts on entity extraction and on mapping keywords to topics (SBERT, BERTopic, and fuzzy matching are discussed there).
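For readers who want to see what a single extraction call involves before opening the template or notebook, the following sketch builds a request for the Natural Language API's `analyzeEntities` REST method and flattens a response into rows. The helper names and the sample payload are hypothetical, the salience values are invented, and the network function is defined but not called here.

```python
import json
import urllib.request

ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_request_body(query, language=None):
    """JSON body for one query; language is optional and auto-detected if omitted."""
    document = {"type": "PLAIN_TEXT", "content": query}
    if language:
        document["language"] = language
    return {"document": document, "encodingType": "UTF8"}


def analyze_entities(query, api_key):
    """Send one query to the API. Makes a network call; not run in this sketch."""
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(build_request_body(query)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def flatten_entities(query, payload):
    """One output row per entity, ready to write to CSV or a sheet."""
    return [
        {
            "keyword": query,
            "entity": entity.get("name"),
            "type": entity.get("type"),
            "salience": entity.get("salience", 0.0),
            "mentions": [m["text"]["content"] for m in entity.get("mentions", [])],
        }
        for entity in payload.get("entities", [])
    ]


# Hand-written payload in the response shape; values are illustrative only.
sample_payload = {
    "entities": [
        {"name": "Nike", "type": "ORGANIZATION", "salience": 0.6,
         "mentions": [{"text": {"content": "nike", "beginOffset": 0}, "type": "PROPER"}]},
        {"name": "running shoes", "type": "CONSUMER_GOOD", "salience": 0.4,
         "mentions": [{"text": {"content": "running shoes", "beginOffset": 5}, "type": "COMMON"}]},
    ],
    "language": "en",
}

body = build_request_body("nike running shoes")
rows = flatten_entities("nike running shoes", sample_payload)
print(rows)
```

In a real run, `analyze_entities` would be called once per keyword, with pauses and error handling sized to your project's quota.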

Key insights & takeaways

  • Salience is unreliable for short queries; use a dataset-level prominence score.
  • Programmatic extraction beats Sheets for speed, scale, and reliability.
  • Graphs + clusters expose related concepts and adjacent topics for expansion.
  • Combine automation with manual review to correct model blind spots.
  • Manage quotas/language errors and keep runs scoped to prioritized entities.
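The first takeaway above recommends a dataset-level prominence score in place of per-query salience. The lesson preview does not give the formula, so the sketch below uses an invented one purely for illustration: a weighted blend of the share of queries that mention an entity and the share of total search volume those queries carry. The function name, the `volume_weight` parameter, and the sample rows are all assumptions.

```python
def prominence_scores(rows, volume_weight=0.5):
    """Blend query share and volume share per entity (illustrative formula only)."""
    volume_by_keyword = {row["keyword"]: row["search_volume"] for row in rows}
    total_queries = len(volume_by_keyword)
    total_volume = sum(volume_by_keyword.values())

    # Collect the distinct keywords that mention each entity.
    keywords_by_entity = {}
    for row in rows:
        keywords_by_entity.setdefault(row["entity"], set()).add(row["keyword"])

    scores = {}
    for entity, keywords in keywords_by_entity.items():
        query_share = len(keywords) / total_queries
        entity_volume = sum(volume_by_keyword[k] for k in keywords)
        volume_share = entity_volume / total_volume if total_volume else 0.0
        blended = (1 - volume_weight) * query_share + volume_weight * volume_share
        scores[entity] = round(blended, 4)
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical merged rows: one row per (keyword, entity) pair.
rows = [
    {"keyword": "nike running shoes", "entity": "Nike", "search_volume": 5000},
    {"keyword": "nike running shoes", "entity": "running shoes", "search_volume": 5000},
    {"keyword": "adidas running shoes", "entity": "Adidas", "search_volume": 3000},
    {"keyword": "adidas running shoes", "entity": "running shoes", "search_volume": 3000},
    {"keyword": "nike air max price", "entity": "Nike", "search_volume": 2000},
    {"keyword": "nike air max price", "entity": "Air Max", "search_volume": 2000},
]
scores = prominence_scores(rows)
for entity, score in scores.items():
    print(entity, score)
```

Because the score is computed across the whole dataset, it stays meaningful even when every individual query is only two or three words long.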

Length: 32 minutes | Difficulty: Standard