Building a Semantic Recommendation Engine

Posted on August 08, 2017

MarkLogic’s triple store allows us to use custom inferencing rules, consumer profiles and dynamic behavior to build a simple, foundational, semantic recommendation engine. Imagine you’re a content provider and two consumers engage with your product. Both click “like” or rate the experience a 10, depending on how you collect feedback. You’re in the enviable position of having satisfied consumers, and the challenging question becomes, “How do you create additional value for your business?” One way is to recommend targeted content to each consumer.

Chances are, the 20-year-old female student has given a superior rating for a reason that differs from that of the 55-year-old male veteran. If the young student liked the lead actor’s performance, her recommendation might come from a selection completely outside the genre, but one that features a similar actor in a similar role. Likewise, statistics show the older veteran might be drawn to the topic itself, so he would receive a recommendation within the genre but performed by an entirely different cast.

Let’s explore, by example, how MarkLogic’s semantic features can help with this challenge by creating a simple, foundational recommendation engine.

Here’s what we’ll need:

  1. An ontology
  2. Demographic information and dynamic behavior information about our consumers
  3. Custom inferencing rules
  4. A recommendation engine that merges the above

The remainder of this post will explore how each item in the list contributes to delivering a simple, targeted recommendation for each consumer.

An ontology

Figure 1 shows an ontology that describes our entities and properties. In it we define two types of consumers, three types of movies and two properties. One property, “likes,” is the dynamic behavior that drives our dynamic recommendation. The other property, “getsRecommendation,” is the response from our recommendation engine.

Figure 1: A Simple Ontology

In Query Console, run this code against your content database with Query Type “SPARQL Update” to insert the ontology:

## Run against content DB
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rec:  <http://semantics-rec.com/facts#>

## insert  ontology data 
INSERT DATA 
{ 
  rec:Veteran rdf:type rdfs:Class .
  rec:Student rdf:type rdfs:Class .
  rec:Movie rdf:type rdfs:Class .
  rec:MilitaryMovie rdf:type rdfs:Class .
  rec:MilitaryActionMovie rdf:type rdfs:Class .
  rec:HeroMovie rdf:type rdfs:Class .

  rec:MilitaryActionMovie rdfs:subClassOf rec:MilitaryMovie .
  rec:MilitaryMovie rdfs:subClassOf rec:Movie .
  rec:HeroMovie rdfs:subClassOf rec:Movie .

  rec:getsRecommendation rdf:type rdf:Property .
  rec:likes rdf:type rdf:Property 
}

Demographic Information, Descriptive Metadata and Dynamic Behavior

Additional triples contain information about our consumers, our content’s descriptive metadata and the dynamic behavior of our consumers. In this case, the dynamic behavior is that each consumer “likes” a movie. The following facts (triples) enter the database dynamically after each consumer clicks the like button.

BigJohn likes SavingPrivateRyan
YoungJane likes SavingPrivateRyan
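
As a rough illustration of how such a fact might be generated when a like is clicked, the plain JavaScript sketch below builds the SPARQL Update text for a single “likes” triple. The helper name buildLikeUpdate and its validation rule are hypothetical and not part of MarkLogic or this post; only the rec: namespace and the likes property come from the ontology above.

```javascript
// Hypothetical helper (an assumption for illustration, not a MarkLogic API):
// builds the SPARQL Update text that records one "like" event as a triple.
const REC_NS = 'http://semantics-rec.com/facts#';

function buildLikeUpdate(consumer, movie) {
  // Accept only simple local names so the generated update stays well-formed.
  const localName = /^[A-Za-z][A-Za-z0-9_]*$/;
  if (!localName.test(consumer) || !localName.test(movie)) {
    throw new Error('consumer and movie must be simple local names');
  }
  return [
    `PREFIX rec: <${REC_NS}>`,
    'INSERT DATA {',
    `  rec:${consumer} rec:likes rec:${movie}`,
    '}'
  ].join('\n');
}

const likeUpdate = buildLikeUpdate('BigJohn', 'SavingPrivateRyan');
console.log(likeUpdate);
```

The resulting string has the same shape as the INSERT DATA statements used elsewhere in this post, so it could be submitted wherever your application already runs SPARQL Update.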

Demographic information such as Big John’s veteran status and Young Jane’s student status is assumed to be collected as part of a company’s ongoing engagement with Big John and Young Jane as fans and subscribers.

Descriptive metadata, such as the movie’s genre, is standard title information and represents a small part of the growing set of metadata providers are creating about their content through manual and automated tagging initiatives.

Run the code below using Query Type “SPARQL Update” to add these triples to your database.

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rec:  <http://semantics-rec.com/facts#>

INSERT DATA 
{ 
  rec:BigJohn rdf:type rec:Veteran .
  rec:YoungJane rdf:type rec:Student .

  rec:SavingPrivateRyan rdf:type rec:MilitaryActionMovie .
  rec:Platoon rdf:type rec:MilitaryMovie .
  rec:HacksawRidge rdf:type rec:MilitaryMovie .
  rec:BourneIdentity rdf:type rec:HeroMovie .
  rec:MissionImpossibleRogueNation rdf:type rec:HeroMovie .

  rec:BigJohn rec:likes rec:SavingPrivateRyan .
  rec:YoungJane rec:likes rec:SavingPrivateRyan 
}

Custom inferencing rules for each consumer

In the code below, we insert two custom inferencing rules, one for each consumer type. Each rule pertains to the dynamic behavior of “liking” a movie. This is where the rubber meets the road for creating targeted, custom recommendations or consumer experiences.

We can see that both rules capture the act of liking a MilitaryActionMovie. However, because Big John is a Veteran, he’ll receive MilitaryMovie recommendations, whereas because Young Jane is a Student, she’ll receive HeroMovie recommendations.

These rules are no doubt overly generalized and simple. However, one could imagine including other algorithmic influencers such as age, gender, education level, income level, etc., which may be known about the consumer. SPARQL allows you to include these factors yet make them OPTIONAL if the information is not present in the consumer’s static profile or dynamic behavior.
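
To make the logic of the two rules concrete, here is a minimal plain JavaScript sketch that applies the same pattern to a handful of in-memory triples. It runs outside MarkLogic and is not how the inference engine is implemented; the names sampleFacts, recRules and applyRules are assumptions made for this illustration, and the rec: prefix is dropped for brevity.

```javascript
// In-memory stand-ins for the triples inserted earlier (rec: prefix omitted).
const sampleFacts = [
  ['BigJohn', 'type', 'Veteran'],
  ['YoungJane', 'type', 'Student'],
  ['SavingPrivateRyan', 'type', 'MilitaryActionMovie'],
  ['BigJohn', 'likes', 'SavingPrivateRyan'],
  ['YoungJane', 'likes', 'SavingPrivateRyan']
];

// Each entry mirrors one rule: consumer type + liked movie type -> recommended class.
const recRules = [
  { consumerType: 'Veteran', likedType: 'MilitaryActionMovie', recommend: 'MilitaryMovie' },
  { consumerType: 'Student', likedType: 'MilitaryActionMovie', recommend: 'HeroMovie' }
];

// True when the exact triple (s, p, o) is present in the list.
function hasTriple(triples, s, p, o) {
  return triples.some(t => t[0] === s && t[1] === p && t[2] === o);
}

// For every "likes" fact, emit a getsRecommendation triple when a rule matches.
function applyRules(triples, ruleList) {
  const inferred = [];
  for (const [s, p, o] of triples) {
    if (p !== 'likes') continue;
    for (const rule of ruleList) {
      if (hasTriple(triples, s, 'type', rule.consumerType) &&
          hasTriple(triples, o, 'type', rule.likedType) &&
          !hasTriple(inferred, s, 'getsRecommendation', rule.recommend)) {
        inferred.push([s, 'getsRecommendation', rule.recommend]);
      }
    }
  }
  return inferred;
}

const inferredTriples = applyRules(sampleFacts, recRules);
console.log(inferredTriples);
```

With the sample facts, the sketch derives one getsRecommendation triple per consumer: MilitaryMovie for BigJohn and HeroMovie for YoungJane.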

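Run the code below to insert the rules. Note that this code must be run against your Schemas database. It’s written in JavaScript or XQuery, rather than SPARQL Update, so that we can specify the document URI where the rules live, which lets us refer to them as a set later. The JavaScript version comes first, followed by the XQuery equivalent; run one or the other.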

declareUpdate();

var textNode = new NodeBuilder();
textNode.addText(
  `
    # Incentive rules for inference
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX rec: <http://semantics-rec.com/facts#>

    rule "Gets Military Movie Recommendation" CONSTRUCT {
      ?s rec:getsRecommendation rec:MilitaryMovie
    } {
      ?s rdf:type rec:Veteran .
      ?o rdf:type rec:MilitaryActionMovie .
      ?s rec:likes ?o
    }

    rule "Gets Hero Movie Recommendation" CONSTRUCT {
      ?s rec:getsRecommendation rec:HeroMovie
    } {
      ?s rdf:type rec:Student .
      ?o rdf:type rec:MilitaryActionMovie .
      ?s rec:likes ?o
    }
  `
);

xdmp.documentInsert(
  '/rules/semantics-rec.rules',
  textNode.toNode()
);

xquery version "1.0-ml";

xdmp:document-insert(
  '/rules/semantics-rec.rules',
  text{
    '
    # Incentive rules for inference
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX rec: <http://semantics-rec.com/facts#>

    rule "Gets Military Movie Recommendation" CONSTRUCT {
      ?s rec:getsRecommendation rec:MilitaryMovie
    } {
      ?s rdf:type rec:Veteran .
      ?o rdf:type rec:MilitaryActionMovie .
      ?s rec:likes ?o
    }

    rule "Gets Hero Movie Recommendation" CONSTRUCT {
      ?s rec:getsRecommendation rec:HeroMovie
    } {
      ?s rdf:type rec:Student .
      ?o rdf:type rec:MilitaryActionMovie .
      ?s rec:likes ?o
    }
    '
  }
)

Recommendation engine

Finally, we come to the recommendation engine. The code below dynamically leverages the custom inferencing rules and determines the following:

  1. If the consumer is a “Student,” it will recommend “The Bourne Identity” and “Mission Impossible Rogue Nation” in response to a “like” of “Saving Private Ryan.”
  2. If the consumer is a “Veteran,” it will recommend “Hacksaw Ridge” and “Platoon” in response to a like of the same movie, “Saving Private Ryan.”

/*
 * Recommendation for Veteran Big John and Student Young Jane
 * Change userType from "Student" to "Veteran" (or vice versa) below
 * Run against content DB - RELIES ON /rules/semantics-rec.rules
 * Finds recommended movies for a Veteran or a Student
 */
/* pass both rulesets as an array; ("a","b") in JavaScript evaluates to "b" only */
let rdfsStore =
  sem.rulesetStore(["rdfs.rules", "/rules/semantics-rec.rules"], sem.store());
let userType = "Student" /* "Veteran" */;

/* use the store you just created - pass it into sem.sparql() */
sem.sparql(
  `
  PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX owl:  <http://www.w3.org/2002/07/owl#>
  PREFIX rec:  <http://semantics-rec.com/facts#>

  SELECT ?s ?r
  WHERE {
    ?r rdf:type ?o .
    ?s rec:getsRecommendation ?o .
    ?s rdf:type ?userType
  }`,
  {
    "userType": sem.iri("http://semantics-rec.com/facts#" + userType)
  },
  [],
  rdfsStore
)

(:
 # Recommendation for Veteran Big John and Student Young Jane
 # Change $user-type from "Student" to "Veteran" (or vice versa) below
 # Run against content DB - RELIES ON /rules/semantics-rec.rules
 # Finds recommended movies for a Veteran or a Student
:)
let $rdfs-store := 
  sem:ruleset-store(("rdfs.rules","/rules/semantics-rec.rules"), sem:store() )
let $user-type := "Student" (: "Veteran" :)
return
  (: use the store you just created - pass it into sem:sparql() :)
  sem:sparql(
    '
    PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl:  <http://www.w3.org/2002/07/owl#>
    PREFIX rec:  <http://semantics-rec.com/facts#>

    SELECT ?s ?r
    WHERE {
      ?r rdf:type ?o .
      ?s rec:getsRecommendation ?o .
      ?s rdf:type ?userType 
    }',
    map:new(map:entry("userType",
      sem:iri("http://semantics-rec.com/facts#" || $user-type))
    ),
    (),
    $rdfs-store
  )
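
The join performed by the SELECT query can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions, not MarkLogic code: the lookup tables and the recommendFor function are hypothetical names, the getsRecommendation values are taken as already inferred by the custom rules, and only asserted movie types are matched. Subclass inference from rdfs.rules is not modeled here, so results in MarkLogic may include additional rows.

```javascript
// Asserted movie types from the sample data (rec: prefix omitted).
const movieTypes = {
  SavingPrivateRyan: 'MilitaryActionMovie',
  Platoon: 'MilitaryMovie',
  HacksawRidge: 'MilitaryMovie',
  BourneIdentity: 'HeroMovie',
  MissionImpossibleRogueNation: 'HeroMovie'
};

// Consumer types from the sample data.
const consumerTypes = { BigJohn: 'Veteran', YoungJane: 'Student' };

// Assumed output of the two custom rules (?s rec:getsRecommendation ?o).
const recommendedClass = { BigJohn: 'MilitaryMovie', YoungJane: 'HeroMovie' };

// Mirrors: SELECT ?s ?r WHERE {
//   ?r rdf:type ?o . ?s rec:getsRecommendation ?o . ?s rdf:type ?userType }
function recommendFor(userType) {
  const rows = [];
  for (const [s, type] of Object.entries(consumerTypes)) {
    if (type !== userType) continue;
    for (const [r, o] of Object.entries(movieTypes)) {
      if (o === recommendedClass[s]) rows.push({ s, r });
    }
  }
  return rows;
}

const studentRows = recommendFor('Student');
const veteranRows = recommendFor('Veteran');
console.log(studentRows);
console.log(veteranRows);
```

Switching the argument between 'Student' and 'Veteran' plays the same role as changing the userType binding in the queries above.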

Conclusion

Rapid change is occurring in Media & Entertainment. Industry players are forced to adapt. One imperative driving content providers is determining what to recommend to consumers who demand custom, rich experiences from an infinite shelf-space.

This post demonstrated how MarkLogic’s semantic features allow providers to combine descriptive metadata about their content with static user profiles and dynamic consumer behavior. The combination enables providers to build a simple, foundational recommendation engine that can be expanded for more precise and dynamic targeting. The goal is greater consumer loyalty, retention and opportunity from higher-value consumer engagement.

Michael Malgeri

Michael Malgeri is a Principal Technologist with MarkLogic. He works with companies to match their business requirements with MarkLogic’s enterprise NoSQL database and semantic features. He helps organizations reduce costs, automate processes, find new opportunities and create applications that bring high value to businesses and their customers. Michael focuses on the media and entertainment industry, where content providers, distributors and related companies are seeking to leverage the power of data in order to capture new opportunities driven by expanding global information consumption.

Michael holds Master’s Degrees in Computer Science, Business and Mechanical Engineering. He's been a Certified Project Management Professional since 2011.
