Saving a Twitter stream to RavenDB database using C#

In an earlier post  I explained how you can use C# to access a Twitter stream. In this post I will show you how to save the tweets from a Twitter stream to RavenDB.

The goal of this post is not to perform a deep dive into NOSQL databases or the Tweetinvi API. Instead its to get you up and running with the minimum of ceremony so you can start conducting your own experiments.

Raven DB is an open source NOSQL database for.NET which as my first experience of a NOSQL database I have found relatively straightforward to start experimenting with.

You can download RavenDB from here.  At the time of writing the stable release was 3.5.3 and I chose to use the installer which then proceeded to install RavenDB via the familiar wizard installation process.

 

 

 

 

 

 

 

 

 

 

 

Once installed you should have a folder structure similar to this:

If, like me you are new the world of NoSQL databases it is worth working your way through the Fundamentals tutorial. I found this an excellent introduction which I highly recommend.

To start RavenDB double click on the Start.cmd batch file in the root of the RavenDB directory. You should shortly see a new command window and a new tab of your default browser showing what databases you have. (which will be empty for the first time launch)

With RavenDB installed and running we can now start Visual Studio and create a new console application. I’ve called mine TrendingOnTwitterNoSQL

Using NuGet, add the following packages:

TweetinviAPI

RavenDB.Client

 

Navigate to Program.cs and add the following using statements:

using System;

using Raven.Client.Document;

using Tweetinvi;

Within the Main method add the following:

Auth.SetUserCredentials("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET");

Replace COMSUMER_KEY etc. with your Twitter API credentials. If you don’t yet have them. You can obtain them by going here and following the instructions.

Now add the following two lines:

  var stream = Stream.CreateFilteredStream();
  stream.AddTrack("CanadianGP");

The first line creates a filtered Twitter stream. A Twitter stream gives you the developer access to live information on Twitter. There are a number of different streams available. In this post we will be using one that returns information about a trending topic. More information about Twitter streams can be found in the Twitter docs and the TweetInvi docs.

At the time of writing, the Canadian Grand Prix was trending on Twitter which you can see in the second line.

The next step is to create a new class which will manage the  RavenDB document store.  Here is the complete code.

using System; 
using Raven.Client; 
using Raven.Client.Document; 

namespace TrendingOnTwitterNoSQL 
{ 
  class DocumentStoreHolder 
    { 
      private static readonly Lazy<IDocumentStore> LazyStore = 
          new Lazy<IDocumentStore>(() => 
          { 
            var store = new DocumentStore 
            { 
              Url = "http://localhost:8080", 
              DefaultDatabase = "CanadianGP" 
            }; 
            return store.Initialize(); 
           }); 
    
    public static IDocumentStore Store => LazyStore.Value; 
  } 
}

In the context of RavenDB, the Document Store holds the RavenDB URL, the default database etc. More information can be found about the Document Store in the tutorial.

According to the documentation for typical applications you normally need one document store hence the reason why the DocumentStoreHolder class is a Singleton.

The important thing to note in this class is the database URL and the name of the Default Database, CanadianGP. This is the name of the database that will store Tweets about the CanadianGP.

Returning to Program.cs add the following underneath stream.AddTrack to obtain a new document store:

  var documentStore = DocumentStoreHolder.Store;

The final class that needs to be created is called TwitterModel and is shown below

namespace TrendingOnTwitterNoSQL
{
  class TwitterModel
  {
    public long Id { get; set; }
    public string Tweet { get; set; }
  }
}

This class is will be used to save the Tweet information that the program is interested in, the Twitter ID and the Tweet.  The is a lot of other information that is available, but for the sake of brevity this example is only interested in the id and the tweet.

With this class created the final part of the code is shown below

using (BulkInsertOperation bulkInsert = documentStore.BulkInsert())
{
  stream.MatchingTweetReceived += (sender, theTweet) =>
  {
    Console.WriteLine(theTweet.Tweet.FullText);
    var tm = new TwitterModel
    {
      Id = theTweet.Tweet.Id,
      Tweet = theTweet.Tweet.FullText
    };

    bulkInsert.Store(tm);
  };
stream.StartStreamMatchingAllConditions();
}

As the tweets will be arriving in clusters, the RavenDB BulkInsert method is used. You can see this at line 1.

Once a matching Tweet is found, line 3, it is output to the console. Next a new TwitterModel object is created and its fields are assigned the Tweet Id and the Tweet Text. This object is then saved to the database.

The complete Program.cs should now look like:

using System;
using Raven.Client.Document;
using Tweetinvi;

namespace TrendingOnTwitterNoSQL
{
  class Program
  {
    static void Main(string[] args)
    {
      Auth.SetUserCredentials("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET");

      var stream = Stream.CreateFilteredStream();
      stream.AddTrack("CanadianGP");

      var documentStore = DocumentStoreHolder.Store;

      using (BulkInsertOperation bulkInsert = documentStore.BulkInsert())
      {
        stream.MatchingTweetReceived += (sender, theTweet) =>
        {
          Console.WriteLine(theTweet.Tweet.FullText);

          var tm = new TwitterModel
          {
            Id = theTweet.Tweet.Id,
            Tweet = theTweet.Tweet.FullText
          };

          bulkInsert.Store(tm);

       };
       stream.StartStreamMatchingAllConditions();
     }
   }
 }
}

After running this program for a short while you will have a number of Tweets saved. To view them, switch back to your browser, if not already on the RavenDB page navigate to http://localhost:8080 and click on the database that you created.

 

 

 

 

 

Selecting the relevant database you will then see the tweets.

 

 

 

 

 

 

 

Summary

In this post I have detailed the steps required to save a Twitter Stream of a topic of interest to a RavenDB.

A complete example is available on github

Acknowledgements

The genesis of this post came from the generous answers given to my question on StackOverflow.