Friday, July 14, 2017

Moore's Law is Dead

My current laptop is three years old, a then top-of-the-line Dell XPS 15. After three years, I felt I could use an upgrade, so naturally I went to check out the latest and greatest XPS 15. The newer edition sports a nicer screen and the RAM is upgradable to 32GB instead of 16GB, but the other specs are pretty similar.

Perhaps the most disappointing aspect of the newer XPS 15 laptop is the CPU. The new XPS 15 uses an Intel Core i7-7700HQ, whereas my three-year-old XPS 15 uses an i7-4702HQ. According to the benchmark I looked up, the newer chip gets a speed score of 8987 whereas the older chip gets a speed score of 7523. That's less than a 20% improvement in speed after three years of R&D. Even worse, the newer chip uses 45W of power whereas the older one used 37W, which is more than a 20% increase in power usage and battery depletion. Obviously, it's unclear whether we can rely on that score as a linear description of performance, but that is ostensibly the purpose of the benchmark.
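As a quick check on the arithmetic, here is a small Python sketch that uses only the numbers quoted above. The score-per-watt figure is my own rough addition, and it assumes each chip really draws its rated wattage, which may not hold in practice.

```python
# Benchmark scores and rated power quoted in the post.
OLD_SCORE, NEW_SCORE = 7523, 8987   # i7-4702HQ vs. i7-7700HQ
OLD_WATTS, NEW_WATTS = 37, 45


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


speed_gain = percent_change(OLD_SCORE, NEW_SCORE)
power_gain = percent_change(OLD_WATTS, NEW_WATTS)

# Score per watt: a rough efficiency figure (assumption: rated power == real draw).
old_efficiency = OLD_SCORE / OLD_WATTS
new_efficiency = NEW_SCORE / NEW_WATTS

print(f"speed: +{speed_gain:.1f}%")    # speed: +19.5%
print(f"power: +{power_gain:.1f}%")    # power: +21.6%
print(f"score per watt: {old_efficiency:.1f} -> {new_efficiency:.1f}")  # 203.3 -> 199.7
```

By this crude measure the newer chip does slightly less work per watt than the older one.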

I am pretty saddened that after three years of developing the latest and greatest chips, it seems that Intel has only increased speed to the detriment of power usage.

Wednesday, October 24, 2012

Django + Google App Engine + MapReduce

If you're using Django-nonrel on Google App Engine, mapreduce will not work out of the box. I put a bit of work into getting it running. Fortunately, I was not the first. This blog post suggests some code to get you started and allow you to run a mapper on all of your entities. Unfortunately, it only allows you to map App Engine entities, not Django entities. The code below fixes that issue. It works in a similar way, but performs a Django "get" before running the mapper to convert a key into a Django entity. This adds a bit more overhead: one more get per map.

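from google.appengine.datastore import datastore_query
from mapreduce import util
from mapreduce.input_readers import AbstractDatastoreInputReader

class DjangoEntityInputReader(AbstractDatastoreInputReader):
    """An input reader that takes a Django model ('app.models.Model')
    and yields entities for that model."""

    def _iter_key_range(self, k_range):
        # This line is cut off in the post and the import for Query is not
        # shown; only the closing parenthesis has been restored.
        query = Query(util.for_name(self._entity_kind))
        raw_entity_kind = query.db_table
        query = k_range.make_ascending_datastore_query(
            raw_entity_kind, keys_only=True)
        # The arguments on the next two calls are also cut off in the post;
        # batch_size and pk=key.id() are assumed completions.
        for key in query.Run(config=datastore_query.QueryOptions(
                batch_size=self._batch_size)):
            # One extra Django get per key turns the key into a model instance.
            yield key, util.for_name(self._entity_kind).objects.get(pk=key.id())

    @classmethod
    def _get_raw_entity_kind(cls, entity_kind):
        """A bit of a hack, returns a table name based on entity kind."""
        return entity_kind.replace(".models.", "_").lower()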

To use the code above, place the class in the module that your mapreduce.yaml points to (myapp.views in this example) and use the following in your mapreduce.yaml:

mapreduce:
- name: My mapper
  mapper:
    input_reader: myapp.views.DjangoEntityInputReader
    handler: myapp.my_mapper
    params:
    - name: entity_kind
      default: myapp.models.MyModel
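For completeness, here is a minimal sketch of what the handler named in the yaml (myapp.my_mapper) might look like. It assumes the mapreduce library passes the handler the Django model instance from each (key, entity) pair the reader yields, which is my understanding of how the library behaved at the time. The `title` field is made up purely for illustration.

```python
def my_mapper(entity):
    """Hypothetical handler: called once per Django model instance."""
    # `title` is an invented field; substitute a real field from your model.
    cleaned = entity.title.strip()
    if cleaned != entity.title:
        entity.title = cleaned
        entity.save()  # ordinary Django ORM save, only when something changed
```

Saving only when the value changes keeps the number of datastore writes down, which matters when the mapper runs over every entity.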

That's all you need to get mapreduce up and running, but there is an additional problem. Mapreduce uses a property called "__scatter__" to scramble up the entities and assign each one to a mapreduce shard. However, Django entities do not have the __scatter__ property, so all of the entities get assigned to a single shard and you do not get to enjoy the massive parallelism of mapreduce. To fix this, you'll need some code of mine, which I posted here. Feel free to contact me if you have any questions.
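To illustrate why a missing __scatter__ property matters, here is a rough, self-contained sketch of the general idea: take a random sample of keys, pick evenly spaced boundaries from it, and route each key to a shard by comparing it against those boundaries. This is not the mapreduce library's actual code, and the function names are my own.

```python
import bisect


def choose_split_points(sample_keys, shard_count):
    """Pick shard_count - 1 boundaries, evenly spaced through a sorted sample."""
    if len(sample_keys) < shard_count:
        raise ValueError("need at least as many sampled keys as shards")
    ordered = sorted(sample_keys)
    stride = len(ordered) / shard_count
    return [ordered[int(stride * i)] for i in range(1, shard_count)]


def shard_for(key, split_points):
    """Index of the shard whose key range contains key."""
    return bisect.bisect_right(split_points, key)


# With a sample, keys spread over four shards.
splits = choose_split_points(list(range(0, 100, 10)), 4)
print(splits)                 # [20, 50, 70]
print(shard_for(65, splits))  # 2

# With nothing to sample there are no split points, so every key lands in shard 0.
print(shard_for(65, []))      # 0
```

The last call shows the failure mode described above: with no sampled keys to split on, everything ends up in one shard.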

Sunday, September 30, 2012

PACER API with REST Interface Released

I had previously written a short blog entry on my open source PACER API. The open source project is ongoing, but I have recently devoted my efforts to Docket Alarm and its online PACER REST API, which is now substantially complete.

Docket Alarm's API allows users to search for docket information from Federal courts and pull the information using a simple REST interface.  The API has a wide variety of potential applications, especially for due diligence.  For example, an application that assists in originating loans could use the API to automatically look up a potential creditor's bankruptcy or litigation history.

The API can search by name, geographic location, date range and a number of other fields.  Additional fields can be added by request.  Once a search is complete, the API can access the case's docket text and associated meta-data. The meta-data contains fields like the judge's name, all of the party names, and the lawyers associated with each party. Finally, the API allows you to pull individual documents as PDFs.  Put together, it is a relatively complete set of features for a variety of applications.
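To give a feel for what a search request against a REST interface like this might look like, here is a minimal sketch that only builds a query URL. The base URL and every parameter name below are placeholders I made up for illustration; the real endpoint and field names are in the API documentation.

```python
from urllib.parse import urlencode

# Placeholder only: this is NOT the real Docket Alarm endpoint.
BASE_URL = "https://api.example.com/search"


def build_search_url(party_name, date_from=None, date_to=None):
    """Build a search URL from a party name and an optional date range."""
    params = {"party_name": party_name}  # hypothetical parameter names
    if date_from:
        params["date_from"] = date_from
    if date_to:
        params["date_to"] = date_to
    return BASE_URL + "?" + urlencode(params)


# e.g. a loan application checking a party's litigation history
print(build_search_url("Acme Corp", date_from="2010-01-01"))
```

A loan-origination tool could issue one such request per applicant and then fetch the docket for each case returned.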

The API only exposes a small subset of the features available on the main Docket Alarm website. If requested, additional features can be added.

The API specification is currently live and fully documented. Documentation is located here. If you are interested in using this feature, please let me know.

Tuesday, April 10, 2012

9th Circuit rules that violating a website's terms of service is not criminal.

Monday, January 23, 2012

U.S. Courts PACER: An Accessible, Open-Source API

Get Access to All Information on the U.S. Courts Docketing System

Anyone who has tried to look up a court case on a government website has run into the Public Access to Court Electronic Records system, or as everyone calls it: PACER. I have developed and just released a new API that gives programmers access to all public information on the U.S. Federal Courts docketing system.

Features include:
1. Search for cases by party name, docket number, and filing date.
2. Retrieve the names of parties to a case, their attorneys, and law firms.
3. Download the entire docket of a particular case.
4. Download PDFs of individual filings and their attachments.
5. Keep track of costs of each PACER transaction.

Right now, there are hooks into all Federal District Courts, most Appeals Courts, most Bankruptcy Courts and also the I.T.C. I am not aware of any other service or API that offers something similar for the I.T.C.

This project does not make PACER free. It still costs $0.08 per page (which can add up quickly). Although the API works perfectly as standalone Python, it can plug into Django (or any other Python framework) very easily. There are also hooks (and some meager documentation) to make it work on Google App Engine.
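To see how quickly the per-page fee adds up, here is a small sketch of a cost tracker. It is not the project's own cost-tracking code; the class and method names are hypothetical, and the only figure taken from this post is the $0.08 per page rate.

```python
from decimal import Decimal

PRICE_PER_PAGE = Decimal("0.08")  # rate quoted in the post


class CostTracker:
    """Hypothetical helper that records the cost of each transaction."""

    def __init__(self):
        self.transactions = []

    def record(self, description, pages):
        """Record one transaction and return its cost."""
        cost = PRICE_PER_PAGE * pages
        self.transactions.append((description, pages, cost))
        return cost

    @property
    def total(self):
        """Sum of all recorded costs."""
        return sum((cost for _, _, cost in self.transactions), Decimal("0"))


tracker = CostTracker()
tracker.record("docket sheet", 12)  # 12 pages
tracker.record("motion", 30)        # 30 pages
print(tracker.total)                # 3.36
```

Decimal is used instead of float so that the running total stays exact to the cent.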

Also note that this project is released under the AGPL, a free and open-source license, but one which requires you to open-source your code if you use it in a program or a web-app.

The project can be found:

I am building a web-service which exposes a REST API to PACER and it will use this open-source API. If you are interested in learning more, let me know.

Tuesday, September 27, 2011

Studying hard? Check out my friend's flashcard app:

Wednesday, September 14, 2011

Found a mirror for android source despite (and being down: