[[{"@type":["BlogPosting"],"@id":"https:\/\/www.schemaapp.com\/schema-markup\/the-4-steps-to-building-a-content-knowledge-graph\/#BlogPosting","@context":{"@vocab":"http:\/\/schema.org\/","kg":"http:\/\/g.co\/kg"},"url":"https:\/\/www.schemaapp.com\/schema-markup\/the-4-steps-to-building-a-content-knowledge-graph\/","publisher":[{"@id":"https:\/\/www.schemaapp.com\/#Organization"}],"audience":"https:\/\/schema.org\/PeopleAudience","inLanguage":[{"@type":"Language","@id":"https:\/\/www.schemaapp.com\/schema-markup\/the-4-steps-to-building-a-content-knowledge-graph\/#BlogPosting_inLanguage_Language","name":"English"}],"mentions":[{"@id":"https:\/\/www.schemaapp.com\/entity#Thing19"},{"@id":"https:\/\/www.schemaapp.com\/entity#Thing13"},{"@id":"https:\/\/www.schemaapp.com\/entity#Thing5"}],"dateModified":"2024-06-18T17:53:59+00:00","headline":"The 4 Steps to Building a Content Knowledge Graph","datePublished":"2024-04-03T17:12:28+00:00","image":[{"@type":"ImageObject","@id":"https:\/\/www.schemaapp.com\/schema-markup\/the-4-steps-to-building-a-content-knowledge-graph\/#BlogPosting_image_ImageObject","url":"https:\/\/www.schemaapp.com\/wp-content\/uploads\/2024\/04\/4-Steps-to-Building-a-Content-Knowledge-Graph-1.png"}],"mainEntityOfPage":"https:\/\/www.schemaapp.com\/schema-markup\/the-4-steps-to-building-a-content-knowledge-graph\/","name":"The 4 Steps to Building a Content Knowledge Graph","articleBody":"Knowledge graphs have been central to semantic technology for decades. From healthcare and eCommerce to fraud detection and SEO, knowledge graphs empower organizations to harness the full potential of their information architecture.\nBut even with a long history, knowledge graphs are more relevant than they\u2019ve ever been. According to Gartner\u2019s Emerging Tech Impact Report, a robust knowledge graph is imperative for organizations looking to implement generative AI technologies. 
Knowledge graphs can help organizations ground their AI initiatives\u2014like LLMs\u2014in factual data about the organization.\nIf you\u2019re interested in building a knowledge graph but are unsure where to start, you\u2019re in luck. The good news is that if you have a website, you can construct a reusable content knowledge graph that supports both SEO and your internal AI initiatives.\nThis article will take you through the four steps of building a content knowledge graph using the Schema.org vocabulary.\nWhy should you use Schema.org to build your content knowledge graph?\nYou can create a knowledge graph using any number of ontologies, vocabularies, or glossaries. However, Schema.org should be the vocabulary of choice for constructing a content knowledge graph since it allows you to simultaneously maximize the SEO benefits.\nHelp search engines clearly understand and contextualize the content on your web page\nThe Schema.org vocabulary was created by major search engines as an industry-standard vocabulary for translating human-readable web content into a language that machines understand. By using this vocabulary to construct a knowledge graph based on your content, you\u2019re also reaping the SEO benefits that come with it, including:\n\nEquipping search engines with an accurate understanding of your brand content\nFacilitating accurate and pertinent search queries that closely match your content\nDriving more targeted, engaged, and quality traffic to your site\n\nAchieve rich results and stand out in search\nBy annotating your web content with the required Schema.org types and properties, search engines like Google may award visually enhanced search features for content like Products, Videos, Recipes, and Ratings. 
These rich results present key information directly in the SERP and can increase click-through rates and drive more engagement and quality traffic to your pages.\nBuilding Your Content Knowledge Graph\nSo you know about the SEO benefits of using Schema.org, but how does that get you a content knowledge graph? In the book Knowledge Graphs: Methodology, Tools and Selected Use Cases, Semantic Web and knowledge graph experts Fensel et al. break down the process of creating a knowledge graph into four steps:\n\nKnowledge Creation,\nKnowledge Hosting,\nKnowledge Curation, and\nKnowledge Deployment.\n\nWe\u2019ve applied an SEO lens to these steps to explain how you can create a robust content knowledge graph using your organization\u2019s web content. Let\u2019s get started.\nStep 1: Knowledge Creation\nThe first step to building a content knowledge graph is having high-quality, original content on your website and marking up that content using the Schema.org vocabulary.\nHave high-quality content on your website\nAs a general first rule, you need to ensure your website content supports the specific objectives of your organization. Whether you\u2019re selling goods and services, educating the public, or building authority in a particular domain of expertise, the content across your website should exist to support those goals.\nBeyond this, Google has shared guidelines on what it deems \u201chelpful, reliable, people-first content.\u201d This is an excellent resource that provides a series of questions you can use to assess the quality of your content. 
For example, you\u2019ll want to ensure your content provides:\n\nOriginal information, reporting, research, or analysis\nA substantial, complete, or comprehensive description of the topic\nSubstantial value when compared to other pages in search results\n\nMarking up your content using the Schema.org vocabulary\nTo start building your content knowledge graph, you must annotate your high-quality content using types and properties from the Schema.org vocabulary. The annotations can be expressed in various formats, but Google recommends using JSON-LD. This translates the human-readable content on your website into machine-readable statements called RDF triples.\nInclude URIs in your Schema Markup to disambiguate your entities\nIn order to provide value beyond SEO, the entities in your Schema Markup must be represented by Uniform Resource Identifiers (URIs).\nIn JSON-LD, these identifiers appear as @ids to give the entities in your markup a unique identity that disambiguates and differentiates them from other entities \u2013 similar to how a social security number can uniquely differentiate people who may share the same name.\nWhile Schema Markup still provides SEO value without including @ids, they are a requirement for the markup to become a reusable knowledge graph.\nHow to Apply JSON-LD to Web Pages\nThere are a few options for implementing Schema Markup on your web pages. You can manually author the JSON-LD and insert it in your webpages\u2019 HTML, or you can use a plugin to generate and deploy the markup on your site.\nManual authoring requires technical expertise and isn\u2019t scalable if you have a large number of webpages, and while plugins that auto-generate markup are a scalable authoring solution, the markup is generally much less descriptive. 
If you want to customize your markup and ensure it is dynamic and connected, we recommend using the Schema App Highlighter to generate and deploy your markup at scale without having to do any manual coding.\nWhatever method you choose, the Schema Markup you author must appear in the HTML of the webpages being described, making it available for search engines and other web crawlers. In this state, your webpage content transforms into semantically enriched data, but this data doesn\u2019t truly become a knowledge graph until it\u2019s been collected and stored.
\nStep 2: Knowledge Hosting\nIn the hosting step, the Schema Markup you\u2019ve authored for your website must be collected and hosted in a way that allows the RDF data to be retrieved.\nCollecting the Schema Markup\nThere are two ways of collecting the Schema Markup once it has been applied to a website:\n1. Crawling: Where a crawler crawls a website, extracts the JSON-LD that has been applied, and stores it in a knowledge graph.\n2. Mapping: Many tools that map content to Schema.org will also store that markup in a knowledge graph.\nBut where does this storage occur?\nStoring Data\nBecause knowledge graphs are represented as RDF triples, the best place to store them for easy retrieval is an RDF database or triplestore. There are a variety of RDF stores available. Examples include:\n\nOpenLink Virtuoso\nOntotext GraphDB\nAmazon Neptune\nStardog\nAllegroGraph\n\nFor more information and to compare the various options, check out DB-engines.com. They rank the popularity of database management systems and provide helpful analysis.\nRetrieving Data\nYou can retrieve RDF data from a database or triplestore using SPARQL \u2013 an RDF query language. In the simplest terms, SPARQL uses known information to find unknown information (variables) using pattern matching.\nFor example, we could write a SPARQL query that says, \u201cFind all the people in my database who work for Schema App and know about semantic technology.\u201d \u201cMark van Berkel,\u201d our co-founder, would return as a match, and so would all other entities in our knowledge graph that match the same criteria.\nWhen you add Schema Markup to your website using Schema App\u2019s authoring tools, we host that data for you in our Knowledge Graph Data Platform. You can query your own graph using the SPARQL endpoint interface in your account. 
You can also use our Export Data API to export your knowledge graph for reuse in other contexts.\nOnce you have found an appropriate way to host your knowledge graph, you can move on to curation.\nStep 3: Knowledge Curation\nCleaning data is time-consuming and resource-intensive, especially if you\u2019ve got a lot of it. That said, we will address three of the most important aspects of curating your data to ensure your high-quality web content has resulted in a high-quality content knowledge graph.\nIn the knowledge curation step, you should ensure that the data within your content knowledge graph is:\n\nAccessible\nCorrect\nComplete\n\nLet\u2019s break those down further.\nAccessible\nThe data in your knowledge graph needs to be available.\nFor example, when extracting your content knowledge graph from your website, you\u2019ll want to ensure that none of your web pages run into issues like \u201c404 not found\u201d errors. You will also want to ensure that the RDF store you\u2019ve selected for hosting keeps your data retrievable and secure.\nCorrect\nYour markup is free of syntax errors\nThe language used to express your knowledge graph can\u2019t have syntax errors like missing commas or brackets in the wrong places. Auto-generated markup from plugins or other authoring tools generally prevents this from happening, but if you\u2019ve authored your markup manually, you\u2019ll need to take extra precautions. Syntax errors can be identified on a page-by-page basis by running your pages through the Schema Validator.\nThe markup must align with the content on the page\nIf you make content changes to your page without updating your markup, your knowledge graph will become out of date.\nAssessing whether the statements in your markup are correct and up to date can be difficult depending on the size of your dataset and how you manage your Schema Markup. This is especially true if you implement your markup manually. 
As previously mentioned, data cleanup is complex and resource-intensive, and becomes ever more so as your content grows and changes over time.\nTherefore, we recommend using a dynamic Schema Markup generator tool like the Schema App Highlighter to ensure your page\u2019s markup always aligns with its content and your RDF triples remain correct.\nThe markup follows the Schema.org vocabulary guidelines\nYou also need to ensure that your entity types use the most descriptive properties and that the properties used connect to expected types. For example, I can\u2019t say that a Person worksFor another Person, because Schema.org states that the worksFor property can only connect a Person to an Organization.\nThe types and relationships you apply to your data dictate what you\u2019re able to query for in your graph, and as a result also play an essential role in the completeness of your Knowledge Curation.\nComplete\nEnsure your knowledge graph contains enough data to answer queries relevant to your use cases. For example, if you want to know the correlation between ratings for products of specific sizes, colors, or prices, those properties must exist in your data.\nIn cases where your content references well-known entities (like brands, people, places, or concepts), you may also want to implement entity linking. Entity linking is a process that identifies entities in text and links them to corresponding known entities from external knowledge bases like Wikipedia, DBpedia, and Google\u2019s Knowledge Graph.\nYou can apply entity linking:\n\nManually for absolute precision\nAutomatically using Natural Language Processing APIs\n\nOnce embedded in your markup, these entities provide additional SEO value by helping search engines like Google disambiguate and contextualize your content to provide more accurate results for search queries. 
When it comes to your content knowledge graph, entity linking makes your knowledge graph more descriptive, providing an even richer data layer for you to reuse.\nStep 4: Knowledge Deployment\nThe knowledge deployment stage transforms the knowledge graph\u2019s theoretical structure into practical applications that drive tangible benefits for your organization and its stakeholders. In fact, I prefer to call this the \u201cReuse\u201d stage, since this is when you can finally reuse the knowledge graph you\u2019ve created for all sorts of different initiatives.\nTo reap the SEO benefits we\u2019ve previously described, you\u2019ll need to ensure you\u2019ve published your Schema Markup externally for search engines to consume.\nBeyond the SEO benefits, your content knowledge graph can be reused for things like enhancing user experience, content optimization, and AI and machine learning. Let\u2019s explore these opportunities further.\nEnhancing User Experience\nYou can utilize your content knowledge graph to improve website navigation and internal search functionality.\nFor example, if a user visits a product page on an eCommerce site for smartphones, the content knowledge graph can be leveraged for a recommendation engine to dynamically generate suggestions based on the products being viewed. This can appear as a \u201cYou May Also Like\u201d section or complementary products, like phone cases or chargers, suggested during checkout. This enhanced user experience can significantly increase engagement and conversion rates.\nContent Optimization\nYou can use your content knowledge graph to optimize existing content or identify gaps in your content.\nFor instance, your organization likely publishes blog posts on various topics. With a content knowledge graph, you can analyze the connections among entities in your blog posts. This analysis helps you pinpoint clusters of related topics or categories that have more coverage. 
If you notice gaps in the topics your organization wants to emphasize in its web presence, you can create additional content to fill those gaps.\nAI and Machine Learning Applications\nOrganizations can use knowledge graphs to accelerate their AI initiatives, including chatbots and other LLM functions.\nKnowledge graphs provide a foundation for training AI and machine learning models for tasks such as natural language processing, recommendation systems, and predictive analytics. Knowledge graph data is already structured, making it easier for machines to process than unstructured content (natural language). This can make AI less costly to run as usage scales.\nIf you\u2019re concerned about the risks of hallucinations from LLMs, you\u2019ll be happy to know that knowledge graphs can also be leveraged for Retrieval-Augmented Generation (RAG), which grounds responses in your own data and can produce more accurate answers to queries.\nThese are just some of the ways a content knowledge graph can support your organization in this rapidly changing technical landscape. And the best part is, you can easily construct one with the pre-existing data that constitutes your website.\nDeveloping a Content Knowledge Graph for Your Organization\nAlthough creating a content knowledge graph has only four steps, implementing these steps can be resource-intensive. 
However, with the numerous possibilities for reuse, building a content knowledge graph is a worthwhile investment that will yield a strong return as semantic search, AI, and knowledge management continue to evolve.\nAt Schema App, we can help you implement your Schema Markup data layer and develop a semantically enriched reusable content knowledge graph to prepare your organization for AI and support your semantic SEO efforts.\nGet in touch with our team to learn more.","description":"Learn the four steps of building a content knowledge graph using the Schema.org vocabulary to support your SEO and AI initiatives."},{"@context":"http:\/\/schema.org","@type":"Thing","name":"Knowledge Graph","sameAs":["https:\/\/en.wikipedia.org\/wiki\/Knowledge_graph","http:\/\/www.wikidata.org\/entity\/Q33002955","kg:\/g\/11jtynfm6d"],"description":"information repository structured as a graph","@id":"https:\/\/www.schemaapp.com\/entity#Thing13"},{"@context":"http:\/\/schema.org","@type":"Thing","sameAs":["http:\/\/www.wikidata.org\/entity\/Q180711","kg:\/m\/019qb_","https:\/\/en.wikipedia.org\/wiki\/Search_engine_optimization"],"name":"SEO","alternateName":"search engine optimization","description":"practice and strategies of increasing online visibility in search engine results pages","@id":"https:\/\/www.schemaapp.com\/entity#Thing19"},{"@context":"http:\/\/schema.org","@type":"Organization","address":{"@type":"PostalAddress","streetAddress":"201 - 412 Laird Road","postalCode":"N1G 3X7","addressRegion":"Ontario","addressLocality":"Guelph","addressCountry":"https:\/\/www.schemaapp.com\/#Country","name":"Schema App 
Address","@id":"https:\/\/www.schemaapp.com\/#PostalAddress"},"logo":{"@type":"ImageObject","width":"290","height":"93","url":"https:\/\/ezk8caoodod.exactdn.com\/wp-content\/uploads\/2020\/07\/SA_Logo_Main_Orange_w300-1.png?strip=all&lossy=1&ssl=1","@id":"https:\/\/ezk8caoodod.exactdn.com\/wp-content\/uploads\/2020\/07\/SA_Logo_Main_Orange_w300-1.png?strip=all&lossy=1&ssl=1"},"potentialAction":{"@type":"ScheduleAction","name":"Schedule a Demo","url":"https:\/\/www.schemaapp.com\/book-a-demo\/","@id":"https:\/\/www.schemaapp.com\/#ScheduleAction"},"image":{"@type":"ImageObject","width":"1350","height":"650","url":"https:\/\/ezk8caoodod.exactdn.com\/wp-content\/uploads\/2021\/04\/Schema-App-Featured-Image.png?strip=all&lossy=1&ssl=1","@id":"https:\/\/ezk8caoodod.exactdn.com\/wp-content\/uploads\/2021\/04\/Schema-App-Featured-Image.png?strip=all&lossy=1&ssl=1"},"description":"Schema App is an end-to-end Schema Markup solution that helps enterprise SEO teams develop a knowledge graph and drive search performance.","knowsAbout":["http:\/\/www.wikidata.org\/entity\/Q1891170","https:\/\/www.wikidata.org\/wiki\/Q6108942","https:\/\/www.wikidata.org\/wiki\/Q26813700","https:\/\/www.wikidata.org\/wiki\/Q180711","http:\/\/www.wikidata.org\/entity\/Q33002955"],"keywords":["Structured Data","Knowledge Graph","Rich Results","Semantic Search","Search Engine Optimization","Schema Markup","Semantic Technology"],"location":"http:\/\/www.wikidata.org\/entity\/Q504114","sameAs":["https:\/\/www.instagram.com\/lifeatschemaapp\/","https:\/\/www.linkedin.com\/company\/2480720\/","https:\/\/twitter.com\/schemaapptool","https:\/\/www.youtube.com\/channel\/UCqVBXnwZ3YNf2BVP1jXcp6Q"],"legalName":"Hunch Manifest Inc","name":"Schema 
App","telephone":"+18554448624","url":"https:\/\/www.schemaapp.com\/","email":"support@schemaapp.com","knowsLanguage":"http:\/\/www.wikidata.org\/entity\/Q1860","areaServed":"http:\/\/www.wikidata.org\/entity\/Q13780930","@id":"https:\/\/www.schemaapp.com\/#Organization"},{"@context":"http:\/\/schema.org","@type":"Thing","name":"AI","sameAs":["http:\/\/www.wikidata.org\/entity\/Q11660","kg:\/m\/0mkz","https:\/\/en.wikipedia.org\/wiki\/Artificial_intelligence"],"description":"field of computer science that develops and studies intelligent machines","@id":"https:\/\/www.schemaapp.com\/entity#Thing5"}],{"@context":"https:\/\/schema.org\/","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Schema Markup","item":"https:\/\/www.schemaapp.com\/schema-markup\/#breadcrumbitem"},{"@type":"ListItem","position":2,"name":"The 4 Steps to Building a Content Knowledge Graph","item":"https:\/\/www.schemaapp.com\/schema-markup\/the-4-steps-to-building-a-content-knowledge-graph\/#breadcrumbitem"}]}]