I just added named-entity recognition to the news summariser. It involved downloading Stanford NER, trimming the fat out of it and putting what was left into my filesystem. I then activated it in NLTK and got some results. I would like a better training set, as all artificial intelligence of this sort rests on how accurate a training set you have. There's no demo, but you can download the linked script, install the dependencies, and run it yourself to see how horrible the training set is. If you find a better one, please leave it in the comments. Thanks!
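For anyone who wants to poke at the output, here is a minimal sketch of the post-processing step, not the linked script itself. It assumes the tagger returns a list of (token, tag) pairs with "O" marking non-entities, which is what NLTK's StanfordNERTagger produces. The helper names and the sample sentence are made up for illustration, and the model and jar paths are placeholders for wherever you unpacked Stanford NER.

```python
from itertools import groupby


def group_entities(tagged_tokens, outside_tag="O"):
    """Merge runs of consecutive tokens that share a tag into (text, tag) entities."""
    entities = []
    for tag, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag != outside_tag:
            entities.append((" ".join(token for token, _ in run), tag))
    return entities


def tag_with_stanford(tokens, model_path, jar_path):
    """Hypothetical wrapper; both paths are placeholders, not real locations."""
    # Imported lazily so the grouping helper works without NLTK or Java installed.
    from nltk.tag import StanfordNERTagger
    return StanfordNERTagger(model_path, jar_path).tag(tokens)


# Hand-written sample in the same shape as the tagger's output.
sample = [
    ("Barack", "PERSON"), ("Obama", "PERSON"), ("visited", "O"),
    ("Berlin", "LOCATION"), ("today", "O"),
]
print(group_entities(sample))
```

One caveat with this grouping: two different entities of the same type that sit right next to each other get merged into one, so treat it as a rough cut.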
April 25, 2016
April 16, 2016
I've been posting news summaries on a subreddit (of one) for testing purposes. In the past 30 minutes or so, I wrote the console script above to post the news summaries to one's console. Sample output follows:
April 9, 2016
The gist above is an update to the previous post on implementing one-time passwords. This one uses Flask along with 4-digit passcodes. It also uses units instead of random.org and has a 5-minute expiration, like PayPal. The service is live for your use. I'd appreciate a comment if you find this useful. Functionality demo:
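If you just want the gist of the idea without reading the gist, here is a rough sketch of a 4-digit passcode with a 5-minute expiry. It is not the code from the gist: it leaves out Flask entirely, and it swaps in Python's secrets module as the randomness source instead of units. The class and method names are invented for illustration.

```python
import secrets
import time

CODE_TTL_SECONDS = 5 * 60  # five-minute expiry, as described in the post


class PasscodeStore:
    """In-memory one-time passcode store (illustrative, not the gist's code)."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._codes = {}     # user -> (code, expiry timestamp)

    def issue(self, user):
        """Create a fresh 4-digit code for the user, replacing any earlier one."""
        code = "{:04d}".format(secrets.randbelow(10000))
        self._codes[user] = (code, self._clock() + CODE_TTL_SECONDS)
        return code

    def verify(self, user, attempt):
        """Check an attempt; the stored code is consumed whether or not it matches."""
        code, expires_at = self._codes.pop(user, (None, 0.0))
        if code is None or self._clock() > expires_at:
            return False
        return secrets.compare_digest(code, attempt)
```

Popping the code on every attempt is what makes it one-time: a wrong guess burns the code, which matters when there are only 10,000 possibilities.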
March 28, 2016
Recently, I've done a fair amount of work in Java and its insanely huge class library. Like a TV sales pitch: but wait, there's more. Given that it was never intended to be used by the world outside of academia, and yet was, the good lords granted us build systems to spare us from typing long commands again and again. In keeping with Java's verbose design goals, every jar your project uses must be listed on its classpath, which became unwieldy, impractical and annoying. The solution was dependency management systems such as Maven, Ivy, Gradle, or sbt.
But while that solved the problem of compiling the project, it did not solve the problem completely. The good Java developers then came up with the concept of the fat jar, a jar with all dependencies included. And, again, every build tool had its own method (or methods) of achieving this. Ivy's is the most transparent, while Maven hides the same thing in its shade plugin, which bolts itself onto the package phase so as to try to be transparent. Gradle is similar to Ivy in that it requires a custom task. And sbt also has an assembly plugin.
This is a mere outline of the state of Java development infrastructure, circa Q1 2016. We are currently using all of these approaches in different projects, all of which are still supported. When I want to add, say, a better date handling class, how do I remember the specific syntax for the specific build tool? I could go through links to figure it out, or I could use units to find it, copy the correct syntax for the build system straight into the build file, and have it work once and for all time.
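To show how much the syntax varies for the very same dependency, here is a small sketch that renders one set of coordinates in each tool's format. It is not how units does it. The function name is made up, and the joda-time coordinates are only an illustrative stand-in for "a better date handling class".

```python
def dependency_snippets(group, artifact, version):
    """Render one dependency in the syntax each build tool expects."""
    return {
        "maven": (
            "<dependency>\n"
            "  <groupId>{g}</groupId>\n"
            "  <artifactId>{a}</artifactId>\n"
            "  <version>{v}</version>\n"
            "</dependency>"
        ).format(g=group, a=artifact, v=version),
        "ivy": '<dependency org="{g}" name="{a}" rev="{v}"/>'.format(
            g=group, a=artifact, v=version),
        # 'compile' was the usual Gradle configuration in 2016;
        # newer Gradle versions use 'implementation' instead.
        "gradle": "compile '{g}:{a}:{v}'".format(g=group, a=artifact, v=version),
        "sbt": 'libraryDependencies += "{g}" % "{a}" % "{v}"'.format(
            g=group, a=artifact, v=version),
    }


# Illustrative coordinates only; check the repository for the current version.
for tool, snippet in dependency_snippets("joda-time", "joda-time", "2.9.3").items():
    print("--- {} ---\n{}".format(tool, snippet))
```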
March 8, 2016
February 18, 2016
Ok, I exaggerated a tad, but with the latest units feature, you can get the current time corresponding to a phone number. Formats supported are +
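The general idea can be sketched roughly as follows: read the country calling code off the front of the number, then convert the current UTC time to that country's offset. This is only a toy and not the units implementation. The lookup table is hypothetical and deliberately limited to a few countries with a single time zone and no daylight saving, since anything like +1 needs far more than a prefix to resolve.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table: calling codes for single-zone countries without DST.
CALLING_CODE_OFFSETS = {
    "81": timedelta(hours=9),              # Japan
    "91": timedelta(hours=5, minutes=30),  # India
    "65": timedelta(hours=8),              # Singapore
}


def local_time_for(phone_number, now_utc=None):
    """Return the current time in the country the number's calling code points to."""
    if not phone_number.strip().startswith("+"):
        raise ValueError("expected an international number starting with '+'")
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    # Calling codes are one to three digits long; try the longest prefix first.
    for length in (3, 2, 1):
        offset = CALLING_CODE_OFFSETS.get(digits[:length])
        if offset is not None:
            return now_utc.astimezone(timezone(offset))
    raise LookupError("calling code not in the illustrative table")
```

Calling local_time_for("+81 3 1234 5678") returns a timezone-aware datetime nine hours ahead of UTC.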
February 15, 2016
This morning, I was confronted with the task of provisioning a new NLTK instance. With most Python packages, this is no more than using pip and going about your day. Not so with NLTK. In addition to pip install nltk, one must download data files to make it work. Of course, the code to download the data files is old, crufty and doesn't use best practices, as defined by yours truly. So, I rewrote it and came up with this (embedded above).
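For comparison, here is a much smaller sketch of the same chore using NLTK's own downloader rather than a rewrite of it. It is not the embedded script. It relies on nltk.data.find raising LookupError for a missing resource and on nltk.download accepting a quiet flag. The function name and the injectable find and download parameters are my own additions so the logic can be exercised without touching the network.

```python
def ensure_nltk_data(resource_paths, find=None, download=None):
    """Download only the NLTK resources that are not already installed.

    resource_paths look like 'tokenizers/punkt'; the last path component
    is used as the download identifier. Returns the names that were fetched.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested with fakes
        find = find or nltk.data.find
        download = download or (lambda name: nltk.download(name, quiet=True))
    fetched = []
    for path in resource_paths:
        try:
            find(path)  # raises LookupError when the resource is missing
        except LookupError:
            name = path.rsplit("/", 1)[-1]
            download(name)
            fetched.append(name)
    return fetched
```

On a fresh box, something like ensure_nltk_data(["tokenizers/punkt", "corpora/stopwords"]) would fetch both, and a second run would fetch nothing.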
February 11, 2016
Yesterday, I walked over to Yelp for the Dataswarm and Ibis presentations. Wes McKinney mentioned a desire to have pandas read data directly from Kudu. So I wrote the beginnings of an integration over lunch today and have it up here to be ridiculed by you readers.
February 6, 2016
February 4, 2016
Sometime yesterday afternoon, I started getting reports of 404s on the units server. I couldn't look into this till this morning, but it turns out that the host it was on decided to commit ritual seppuku and wipe itself out. Fortunately, the code is backed up. Unfortunately, config.py, which is where all my API keys were, is listed in .gitignore, so it was not. I've trimmed a few services that weren't working and probably never will. Most of it is now up and running on a new box. If anything's broken, please let me know.