Colt's Blog: 2011

Friday, July 15, 2011

New baby Frederickson - 20 weeks along!

Susie and I have been enjoying our pregnancy immensely. Susie has been very comfortable and we've both been feeling the baby kick a lot. Only very recently has Susie experienced any discomfort and it's very minor.

Today Susie and I had our 20 week ultrasound. The baby was very active and the radiologist says that everything looks great. We did not find out the sex of the baby, so it'll be a great little surprise. :)

I've attached the pictures we had taken during the ultrasound.

Colt

Friday, May 20, 2011

Facebook Authorization from C# -- OAuth2

This past week Facebook announced plans to stop supporting users authorized through OAuth 1.0, starting in September. This means we need to bring the RightNow CX product up to their new OAuth 2.0 authorization flow. Based on the information provided to me, I assumed this would be a pretty straightforward move without many problems. I was only partially right.

Things that needed to be done:

Find a way to upgrade the users who have already authorized RightNow using OAuth 1.0.
Move our application authorization to the new OAuth 2.0 flow.
Move our back end implementation to use the single access_token (and possibly some new APIs).

Upgrading the tokens turned out to be super easy. Facebook provides a great little upgrade path using curl.

Upgrading the authorization flow seemed easy enough. According to their documentation, I simply direct the application at a website and detect the redirects. Once it's all done, I have the token and they're "logged in". Of course I don't want to leave them logged in, so I want to log them out and save the token for future use. Facebook has always been notoriously bad at giving developers a good way to do this, so in our previous implementation I devised a reasonably clever method that takes advantage of HTTP's stateless nature. I simply grabbed the document object of the browser and cleared all the cookies for facebook.com. This meant that the browser literally could not remember who was logged in, so no matter how Facebook changed their pages it should still work. As it turns out, this method does not work in the new authorization flow. At first I couldn't figure out what the problem was. It appeared to be logging the user out, but I could not log back in. It turns out there is a type of cookie I had never heard of, called the HttpOnly cookie.

"The HttpOnly cookie is supported by most modern browsers. On a supported browser, a HttpOnly cookie will only be used when transmitting HTTP (or HTTPS) requests. In addition, the cookie value is not available to client side script (such as Javascript), thereby mitigating the threat of cookie theft via Cross-Site-Scripting." via Wikipedia.

So, although I was clearing every cookie I could see in the session, I could not clear the HttpOnly cookies. When the user went to log back in, it didn't look like anyone was logged in, but the login was broken for the next user. I began to search around for answers and decided it would be good to see how the C# SDK did it. I dug in just a bit, only to find out that they simply direct the user to the logout page for mobile Facebook, which logs the user out. There are some other suggestions here, but none of them will work for C# since the login was not done with JavaScript. I'm personally appalled and scared of using this solution, but alas, it seems to be the only one available.
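To make the HttpOnly distinction concrete, here is a minimal Python sketch (not the C# code from our product) that sorts cookies into the ones a page script could see and the ones it could not. The Set-Cookie values, cookie names, and the helper name `split_by_script_visibility` are all made up for illustration; they are not Facebook's actual cookies.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header values, invented for this example.
SET_COOKIE_HEADERS = [
    "locale=en_US; Path=/; Domain=.example.com",
    "session=abc123; Path=/; Domain=.example.com; HttpOnly",
]


def split_by_script_visibility(set_cookie_headers):
    """Return (script_visible, http_only) lists of cookie names."""
    script_visible, http_only = [], []
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        for name, morsel in jar.items():
            # morsel["httponly"] is True when the flag was present, "" otherwise.
            if morsel["httponly"]:
                http_only.append(name)
            else:
                script_visible.append(name)
    return script_visible, http_only


visible, hidden = split_by_script_visibility(SET_COOKIE_HEADERS)
print("clearable from the document object:", visible)
print("invisible to script:", hidden)
```

Clearing cookies through the document object only ever touches the first list, which is consistent with the half-logged-out state described above.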

Moving our back end implementation was easy enough. I simply upgraded our php-sdk to the newest version and began using the api function to make the same calls we were using before. Since we have the access token, I can skip the session validation and start making API calls right from the get-go.
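The access token itself comes out of the "detect the redirects" step described earlier. As a rough Python sketch of that step (our real implementation is C#), assuming the client-side flow where the token comes back in the URL fragment: the dialog URL, the `response_type=token` parameter, and both function names are my assumptions here and should be checked against Facebook's current documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed dialog endpoint; verify against Facebook's documentation.
DIALOG_URL = "https://www.facebook.com/dialog/oauth"


def build_authorize_url(client_id, redirect_uri):
    """Build the URL the embedded browser is pointed at."""
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "token",  # assumption: token returned in the fragment
    })
    return f"{DIALOG_URL}?{query}"


def extract_access_token(navigated_url, redirect_uri):
    """Return the token once the browser lands on redirect_uri, else None."""
    if not navigated_url.startswith(redirect_uri):
        return None  # still somewhere in the login or permission pages
    fragment = urlsplit(navigated_url).fragment
    values = parse_qs(fragment).get("access_token")
    return values[0] if values else None
```

The application would call `extract_access_token` on every navigation event and stop as soon as it returns something other than `None`.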

All in all the conversion is going well, but I don't know how Facebook has managed to go this long without creating a proper way to programmatically "log out". It makes me sad to be using such an obscure (and seemingly fragile) way to log a user out of Facebook. If you know of a better way, I'd be open to suggestions.

--Colt

Wednesday, May 11, 2011

New baby Frederickson is coming!

Yesterday Susie had another ultrasound and since we're past the magical "10 week mark" it's time to announce to the world that we're going to have a baby sometime around December 2. We are so excited that it's been hard to keep it a secret for so long!

Ultrasound at around 10 weeks.

It was such an amazing experience watching our little one move around in there. It's crazy how much you can see on ultrasounds these days. We got to see the baby's fingers and toes as well as see that little heart beating away. I need to start a betting pool on the sex of the baby, but that will come in due time.

--Colt

Thursday, March 31, 2011

Visualization: Pixels, Degrees and lots of data!

My visualization project has led me down a rabbit hole I never knew existed. When I think of the world I think about the miles between location X and location Y, which can be easily translated into latitudes and longitudes. Though I never knew the actual conversion, I knew it was mathematically trivial, and I had never really thought about how this works in something like Google Maps.

Mapping Coordinates and Pixels
Of course, when dealing with mapping software the world can no longer be represented as an ellipsoid. The standard way to project a globe onto a flat surface is the Mercator projection. Using this projection, we can display the globe as a flat map. There is some weirdness about the way the Mercator projection works, which you can read about here. Once the world has been projected into this form we can easily display the map, but Google maps this coordinate system onto yet another coordinate system, their tile system. This tile system is likely not a surprise to anyone who has ever used Google Maps, but the way that it works did surprise me a little bit. Google has a predefined tile size (256 x 256 pixels by default). As you zoom in, the entire world is broken up into more and more tiles, but the viewport always holds about the same number of them. Each zoom level doubles the number of tiles along each side: at zoom level 1 the world is a 2x2 grid, but at zoom level 19 it's 524,288 x 524,288. That's a lot of tiles!

Why does it matter?
Now the question is, where does this fit into my visualization project? What I need is the ability to map a lat/long (or group of lat/longs) into a particular tile at every zoom level. The initial scope of this project is to gather about a week's worth of data and allow the user to view this data at all zoom levels. The data is currently being gathered at about 6 tweets/second. 6 * 60 * 60 gives us 21,600 tweets per hour, and 21,600 * 24 * 7 gives us 3,628,800 (call it 4 million) for the week. The obvious (and bad) solution to this problem would be to simply create a database of the 4 million tweets at each zoom level, and then when I wanted information for a particular tile (for a particular time) I would calculate the number of tweets in that tile/time, make a color out of that number and color the tile. Obviously this solution is very bad and we can do much better. The initial plan is to do a bunch of data preprocessing that will allow the data to be grouped by both location and time, but that is a topic for another day.
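For the curious, the lat/long-to-tile mapping can be sketched in a few lines of Python. This uses the commonly published Web Mercator tile formula and assumes that zoom level 0 is a single tile and that each level doubles the tile count per side; the function names are mine, and this is not code from the project.

```python
import math

MAX_LAT = 85.05112878  # latitude limit of the square Web Mercator map


def tiles_per_side(zoom):
    """Number of tiles along one edge of the world at a zoom level."""
    return 2 ** zoom


def lat_lng_to_tile(lat, lng, zoom):
    """Return the (x, y) tile containing a coordinate; (0, 0) is top-left."""
    n = tiles_per_side(zoom)
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    # Normalize both axes to [0, 1] across the projected world.
    x = (lng + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    # Clamp so that lng == 180 or the southern limit stays on the map.
    tile_x = min(max(int(x * n), 0), n - 1)
    tile_y = min(max(int(y * n), 0), n - 1)
    return tile_x, tile_y


def tiles_for_all_zooms(lat, lng, max_zoom=19):
    """Tile for one coordinate at every zoom level from 0 to max_zoom."""
    return {z: lat_lng_to_tile(lat, lng, z) for z in range(max_zoom + 1)}
```

One nice property of this scheme is that a tile's parent at the previous zoom level is just both indices divided by two (integer division), which should make regrouping counts between zoom levels cheap.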

--Colt

Thursday, March 24, 2011

Visualization Project and Google Maps

This semester I've been tasked with finding a reasonable project for my advanced graphics course. A small part of me wanted to use a basic physics engine, get some nice graphics and make my own version of Angry Birds or something, but I decided I'd go with something else. Maybe something to do with the social world...

I switched my plan to a visualization of individual tweets on top of something like NASA World Wind, but as I began to investigate it, I found it wasn't very useful. A visualization of an individual tweet isn't nearly as cool as a visualization of many tweets over time, which brought me to my final idea (with the help of Tim)...

I think I'm going to do a visualization of Twitter geographical data on top of Google Maps (or Google Earth). This way I'm aggregating the data before showing it to the user, which should be much easier to understand. My plan is as follows:

Divide the map up into a grid. For starters I will just do this on a single major city.
At different zoom levels the data will be regrouped. This way when a user is looking at the entire city it should show them a coarse grid, and as they zoom in, the grid will become more detailed.
Provide (in tooltips or based on click events) details about the data in the grid they're selecting.
Provide some kind of coloring of each grid so that a user can tell where the most tweets are coming from.
Weight the tweets based on how old they are (e.g. a tweet from yesterday doesn't influence the coloring as much as a tweet from today).
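The grid-plus-aging idea in the list above can be sketched roughly like this in Python. Everything specific here is an assumption made for illustration: cells are fixed-size squares in degrees, the age weighting is an exponential decay with a one-day half-life, and the function names are placeholders. None of it is settled design.

```python
import math
from collections import defaultdict


def grid_cell(lat, lng, cell_degrees):
    """Index of the square grid cell containing a coordinate."""
    return (math.floor(lat / cell_degrees), math.floor(lng / cell_degrees))


def decay_weight(age_days, half_life_days=1.0):
    """Weight of a tweet by age: 1.0 when new, halved every half-life."""
    return 0.5 ** (age_days / half_life_days)


def aggregate(tweets, cell_degrees, half_life_days=1.0):
    """Sum age-weighted tweets per cell; tweets are (lat, lng, age_days)."""
    totals = defaultdict(float)
    for lat, lng, age_days in tweets:
        cell = grid_cell(lat, lng, cell_degrees)
        totals[cell] += decay_weight(age_days, half_life_days)
    return dict(totals)


def intensity(totals):
    """Scale cell totals to the 0..1 range for picking a color."""
    peak = max(totals.values(), default=0.0)
    return {cell: (w / peak if peak else 0.0) for cell, w in totals.items()}
```

Regrouping for a different zoom level is then just a matter of calling `aggregate` again with a larger or smaller `cell_degrees`.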

This should provide an aggregation of data that shows where the most geo-coded tweets come from. The plan is to architect this so that, if it works well, I can expand my search to include a larger area.

--Colt

Tuesday, March 22, 2011

Facebook, REST apis and latency

Last week I took some time to do some research about Facebook and the amount of time it would take to get the friend data I wanted. To get a baseline I used our internal development servers to get my list of friends. The problem was that I didn't have a good way to actually see what was coming across the wire, so I chose to use Fiddler to capture this information. My friend and co-worker Leif Wickland did a bit of hand holding to show me the ins and outs of this tool and we used it to reroute requests on our local network and capture the data that was coming across the wire.

Here's what we found out:

The current API we're using for Facebook supports XML but does not support JSON
The current API also does not allow gzip-encoding

Both of these things were a huge bummer, but since it's what we have to work with I continued on. I did some tests and found that about 500 friends was about 50KB of data. Now, this might not really sound like much in the world today, but it takes a whopping 550ms to complete. That's crazy for only 50KB of data! Because of the way TCP works, the transfer takes 6 round trips, and with our ping to Facebook at about 70ms, 420 of that 550ms is spent on TCP overhead. While there is nothing I can do about the TCP overhead (though it will be much better in our hosted environment), it's definitely something I will keep in mind when dealing with our other REST APIs.
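Where might 6 round trips come from? One simplified model, which is my guess and not something read off the Fiddler capture, is TCP slow start: one round trip for the handshake, then a congestion window that starts at 2 segments and doubles every round trip. The segment size (1460 bytes), the initial window, and the assumption of no packet loss are all assumptions on my part.

```python
def round_trips_to_transfer(payload_bytes, mss=1460, initial_window=2,
                            handshake_rtts=1):
    """Round trips needed under an idealized, loss-free slow start."""
    rounds = handshake_rtts
    window = initial_window  # segments that can be sent in one round trip
    sent = 0
    while sent < payload_bytes:
        sent += window * mss
        window *= 2
        rounds += 1
    return rounds


rtt_ms = 70
trips = round_trips_to_transfer(50 * 1000)
print(trips, "round trips,", trips * rtt_ms, "ms waiting on the network")
# prints: 6 round trips, 420 ms waiting on the network
```

Under these assumptions the model happens to land on the same 6 round trips and 420ms, and it also shows why a smaller payload helps: anything under about 43KB would save a full round trip.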

Just to satisfy my curiosity, I calculated that each friend was about 100 bytes of XML. If I were to convert to using the Facebook Graph API and switched to JSON as my data format, it would be more like 80 bytes per friend. While that's a nice savings, it's even better if you then use the compressed JSON stream, which is only 15 bytes per friend.
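If you want to try this kind of measurement yourself, here is a small Python sketch that builds a fake friend list, serializes it as JSON, gzips it, and reports bytes per friend. The data is synthetic (random ids and random 12-letter names), so the numbers it prints will not match the figures above; real names and the real response format will compress differently.

```python
import gzip
import json
import random
import string


def synthetic_friends(count, seed=0):
    """Generate a made-up friend list; not real Facebook data."""
    rng = random.Random(seed)
    friends = []
    for _ in range(count):
        name = "".join(rng.choice(string.ascii_lowercase) for _ in range(12))
        friends.append({"id": str(rng.randrange(10**8, 10**9)), "name": name})
    return friends


def bytes_per_friend(friends):
    """Return (raw JSON bytes, gzipped bytes) per friend."""
    raw = json.dumps({"data": friends}, separators=(",", ":")).encode("utf-8")
    compressed = gzip.compress(raw)
    return len(raw) / len(friends), len(compressed) / len(friends)


raw_size, gzip_size = bytes_per_friend(synthetic_friends(500))
print(f"JSON: {raw_size:.1f} bytes/friend, gzipped: {gzip_size:.1f} bytes/friend")
```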

I think it might be time to do some rewriting...

--Colt