Showing posts with label python. Show all posts

October 17, 2023

How to Add a Row to a DataFrame in Pandas

This is a trivial task that ought not to warrant its own blog post, but it took me almost 24 hours to figure out, so here we are: df.loc[len(df)] = np.nan
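A slightly longer sketch of the same one-liner, for context. The column names and values are made up, and the trick assumes the frame still has its default 0..n-1 integer index; with any other index, len(df) may match an existing label and overwrite that row instead of adding one.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with the default RangeIndex (0, 1).
df = pd.DataFrame({"name": ["ada", "bob"], "score": [1.0, 2.0]})

# len(df) is 2, a label that does not exist yet, so .loc creates the row.
df.loc[len(df)] = np.nan  # new row filled with NaN placeholders

# The same idiom accepts a list with one value per column.
df.loc[len(df)] = ["cy", 3.0]

print(df)
```

Note that a NaN placeholder row turns integer columns into floats, so it may be nicer to assign real values straight away when you have them.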

November 13, 2019

How to Visualize a User's Most Recent Subreddits

The above script lets you see which subreddits a given reddit user has been posting in over their last 100 submissions -- comments and posts. You'll need matplotlib, pandas, and requests to use it. Enjoy!
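The script itself is not reproduced in this listing. As a rough sketch of the counting step only, assuming the input is a parsed Reddit listing (the JSON you get from a user's public page, presumably something like https://www.reddit.com/user/NAME.json?limit=100, which is an assumption here), the subreddit of each item sits under data.children[].data.subreddit. The sample data and the function name are invented for illustration.

```python
from collections import Counter


def subreddit_counts(listing):
    """Count how often each subreddit appears in a parsed Reddit listing."""
    children = listing["data"]["children"]
    return Counter(child["data"]["subreddit"] for child in children)


# Hand-made sample shaped like a Reddit listing; not real user data.
sample = {"data": {"children": [
    {"kind": "t1", "data": {"subreddit": "python"}},       # a comment
    {"kind": "t3", "data": {"subreddit": "python"}},       # a post
    {"kind": "t1", "data": {"subreddit": "learnpython"}},  # a comment
]}}

counts = subreddit_counts(sample)

# Poor man's bar chart; the real thing would hand counts to matplotlib.
for name, n in counts.most_common():
    print(f"{name:12} {'#' * n}")
```

A Counter converts cleanly into a pandas Series, so plotting it as a bar chart afterwards is a short step.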

January 8, 2019

Python API Wrapper for CurrencyStack

In my never-ending quest to make the web digestible by machine, I encountered CurrencyStack on Product Hunt. I noticed they have a JSON API, so I tweeted them offering my services. A rate was agreed upon, and a few hours later it was done. And I have another line on my CV.

July 14, 2017

Where exactly is Javier

On my way to LA, I received a text from a friend who wanted to meet for lunch at Toscana in Century City. Not being familiar with the Los Angeles area, I ended up writing a Python script that pops up the area around the location in OSM. I wanted a higher-quality user interface while keeping the script console-capable, so I could run it remotely. As the only terminal UI library distributed with Python is curses, I looked into using it and came up with this:

November 11, 2016

How to Import your Lyft Receipts into Excel

The Python code above logs into your IMAP-over-SSL inbox, fetches Lyft receipts, parses them, and writes a report that can then be imported into Excel, R, or any other program that can read comma-separated values. I use this to classify my Lyft usage as business or personal -- I can claim the business expense as part of my tax refund.
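The original code is not shown in this listing. A minimal sketch of the parse-and-report half, assuming the messages have already been fetched (for example with imaplib.IMAP4_SSL) and that the first dollar amount in the body is the fare. The receipt wording, the regular expression, and the column names are all assumptions, not Lyft's actual format.

```python
import csv
import io
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Assumed pattern: the first "$12.34"-style amount in the body is the fare.
AMOUNT = re.compile(r"\$(\d+\.\d{2})")


def receipts_to_csv(raw_messages):
    """Turn raw RFC 822 message strings into CSV text: date, subject, amount."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["date", "subject", "amount"])
    for raw in raw_messages:
        msg = message_from_string(raw)
        match = AMOUNT.search(msg.get_payload())
        if match is None:
            continue  # no amount found, so not a receipt we can parse
        date = parsedate_to_datetime(msg["Date"]).date().isoformat()
        writer.writerow([date, msg["Subject"], match.group(1)])
    return out.getvalue()


# Made-up plain-text message standing in for what the inbox would return.
sample = (
    "From: receipts@example.com\n"
    "Subject: Your ride receipt\n"
    "Date: Fri, 11 Nov 2016 09:30:00 -0800\n"
    "\n"
    "Thanks for riding! Total charged: $12.34\n"
)
print(receipts_to_csv([sample]))
```

Real receipts are likely multipart HTML, so get_payload() would need replacing with a walk over the message parts; the sketch sticks to plain text to stay short.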

October 1, 2016

How to See Your Downvoted Comments on Reddit

The above Python script prints a report of each comment's permalink and score when the score is below 1. A demo from a random reddit account follows:

% python ./bin/redditScores.py -u meh613
https://www.reddit.com/r/asktrp/comments/53w05p/signs_a_woman_is_cheating/: 0
https://www.reddit.com/r/asktrp/comments/530p3r/how_do_i_play_this/: 0
https://www.reddit.com/r/dating_advice/comments/52ksoa/ladies_on_a_dating_site_how_do_you_tell_the/: -1
https://www.reddit.com/r/socialskills/comments/51fsak/my_social_life_is_great_but_my_love_life_is/: -3
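The filtering behind a report like this can be sketched in a few lines. This is not the script from the post; it assumes each comment is a dict with the relative permalink and score fields that Reddit's JSON exposes, and the sample comments are invented.

```python
def downvoted(comments, threshold=1):
    """Yield (absolute permalink, score) for comments scoring below threshold."""
    for comment in comments:
        if comment["score"] < threshold:
            yield "https://www.reddit.com" + comment["permalink"], comment["score"]


# Invented comments; only the two fields used above are included.
sample = [
    {"permalink": "/r/example/comments/abc123/a_post/def456/", "score": 4},
    {"permalink": "/r/example/comments/abc124/b_post/def457/", "score": 0},
    {"permalink": "/r/example/comments/abc125/c_post/def458/", "score": -3},
]

for url, score in downvoted(sample):
    print(f"{url}: {score}")
```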

July 14, 2016

How to Get Reddit's Karma Metadata

Karma is the score Reddit assigns to comments. The service sorts the comments themselves by date, but not their scores. The script above runs through and grabs all of your comments, keeps their scores, and shows those comments whose scores have changed in the past 24 hours (86400 seconds). I do hope you find this useful and I look forward to your suggestions as to how to enhance it further.
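One way to picture the "changed in the past 24 hours" step, as a sketch rather than the script's actual logic: keep the last score seen for each comment together with the time it was recorded, then compare against a fresh fetch. The function name, the snapshot layout, and the comment ids are assumptions made for illustration.

```python
import time

DAY = 86400  # seconds, the window mentioned above


def changed_scores(previous, current, now=None, window=DAY):
    """Return {comment_id: (old, new)} for scores that moved within the window.

    previous maps comment id -> (score, unix time that score was recorded);
    current maps comment id -> score as fetched just now.
    """
    now = time.time() if now is None else now
    changes = {}
    for cid, new in current.items():
        if cid not in previous:
            continue  # first sighting, nothing to compare against
        old, seen_at = previous[cid]
        # Only report a change if the old score is recent enough to prove
        # that the movement happened inside the window.
        if new != old and now - seen_at <= window:
            changes[cid] = (old, new)
    return changes


# Invented snapshot: "c" was last recorded two days before the others.
previous = {"a": (3, 1000.0), "b": (1, 1000.0), "c": (5, 1000.0 - 2 * DAY)}
current = {"a": 3, "b": -1, "c": 9}
print(changed_scores(previous, current, now=1000.0 + 3600))
```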

April 25, 2016

How to Approximate Named-Entity Recognition

I just added named-entity recognition to the news summariser. Named-entity recognition is the identification of the proper nouns (named entities) in a piece, followed by classifying them. In this case, I've classified them into locations, people, and organisations (or rather, the Stanford NLP group has). It involved downloading the named-entity package, trimming the fat out of it, and putting it into my filesystem. I then activated it in nltk and got some results. I would like a better training set, as all artificial intelligence of this sort rests on how accurate a training set you have. No demo, but you can download the linked script, install the dependencies, and run it yourself to see how great (or horrible) the training set is. If you find a better one, please leave it in the comments. Thanks!

April 9, 2016

How to Deploy OTP

The gist above is an update on the previous post to implement one-time passwords. This one uses Flask along with 4-digit passcodes. It also uses units instead of random.org and has a 5-minute expiration, like PayPal. The service is live for your use. I'd appreciate a comment if you find this useful. Functionality demo:
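Separately from the demo, and not taken from the gist, here is a minimal sketch of the two properties described above: 4-digit codes and a 5-minute expiry. It leaves out the Flask layer and swaps in Python's secrets module as the randomness source (the post's own source is units); the class and method names are hypothetical.

```python
import secrets
import time

TTL = 5 * 60  # seconds; the five-minute expiry described above


class OneTimePasswords:
    """In-memory store of pending codes, one per user."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so expiry can be tested
        self._pending = {}    # user -> (code, expiry time)

    def issue(self, user):
        """Create, remember, and return a zero-padded 4-digit code."""
        code = f"{secrets.randbelow(10_000):04d}"
        self._pending[user] = (code, self._clock() + TTL)
        return code

    def verify(self, user, code):
        """Check a code once; it is discarded whether or not it matches."""
        stored = self._pending.pop(user, None)
        if stored is None:
            return False
        expected, expires = stored
        return self._clock() < expires and secrets.compare_digest(code, expected)


otp = OneTimePasswords()
code = otp.issue("alice")
print(code, otp.verify("alice", code))
```

Discarding the code on the first attempt is a design choice made for the sketch: with only 10,000 possibilities, allowing unlimited guesses inside five minutes would not be much of a password.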

February 15, 2016

How to download NLTK data

This morning, I was confronted with the task of provisioning a new nltk instance. With most python packages, this is no more than using pip and going about your day. Not so with nltk. In addition to pip install nltk, one must download data files to make it work. Of course, the code to download the data files is old, crufty and doesn't use best practices, as defined by yours truly. So, I rewrote it and came up with this (embedded above).

February 11, 2016

How to use Apache Kudu in Python

Yesterday, I walked over to Yelp for the Dataswarm and Ibis presentations. Wes McKinney mentioned a desire to have pandas read data directly from Kudu. So I wrote the beginnings of an integration over lunch today and have it up here to be ridiculed by you readers.

January 31, 2016

How to Track Expenses

My bank emails me on every transaction. Because I use nmh for archiving mail, I can parse the messages to determine how much I've spent. The script above collects the reports from the mh folder I use for receipts and sums the amounts up, finally spitting out a locale-aware total amount in the inbox. And in 41 lines of Python. Excellent... Future improvements include ripping out the access to messages and just fetching the amounts directly through Yodlee or similar. Watch this space.
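The summing and the locale-aware output can be sketched without the nmh plumbing. This is not the 41-line script; it assumes the message bodies are already in hand and that each contains one amount written like 1,239.99, which is a guess at the bank's format. The sample messages are invented.

```python
import locale
import re
from decimal import Decimal

# Assumed amount format: digits with optional thousands commas, two decimals.
AMOUNT = re.compile(r"(\d[\d,]*\.\d{2})")


def total_spent(bodies):
    """Add up the first amount found in each message body."""
    total = Decimal("0")
    for body in bodies:
        match = AMOUNT.search(body)
        if match:
            total += Decimal(match.group(1).replace(",", ""))
    return total


# Invented transaction alerts, plus one message with no amount at all.
messages = [
    "You spent $12.50 at COFFEE SHOP",
    "Card purchase of $1,239.99 at AIRLINE",
    "Your statement is ready",
]
total = total_spent(messages)

try:
    locale.setlocale(locale.LC_ALL, "")  # adopt the user's locale if available
except locale.Error:
    pass  # otherwise keep the default "C" locale

# Grouping and decimal separators follow whichever locale is active.
print(locale.format_string("%.2f", total, grouping=True))
```

Decimal is used for the running total so that cents do not drift the way they can with floats.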

January 18, 2016

How to Scrape r/dailyprogrammer

I greatly enjoy brainteasers. Thus, I follow the dailyprogrammer subreddit and do most of their challenges. The task is to mine and categorize the challenges. Formatting according to the spec is an exercise left to the reader; this will just get the posts into groups that isolate the necessary parts. The code above is a function that accomplishes the narrow task of isolating the date, the level, and the title of the challenge from the subreddit's RSS feed using feedparser.
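The function from the post is not shown in this listing. A minimal sketch of the isolating step, assuming post titles follow the pattern "[YYYY-MM-DD] Challenge #N [Level] Title" and that the titles have already been pulled out of the feed entries. The regular expression, the function name, and the sample title are illustrative assumptions.

```python
import re
from datetime import date

# Assumed title layout: "[2016-01-18] Challenge #250 [Easy] Some Title"
TITLE = re.compile(
    r"\[(?P<date>\d{4}-\d{2}-\d{2})\]\s*"   # posting date in brackets
    r"Challenge\s*#\s*\d+\s*"               # challenge number, not captured
    r"\[(?P<level>[^\]]+)\]\s*"             # difficulty level in brackets
    r"(?P<title>.+)"                        # everything else is the title
)


def parse_title(text):
    """Split a challenge title into date, level, and title, or return None."""
    match = TITLE.match(text.strip())
    if match is None:
        return None
    return {
        "date": date.fromisoformat(match["date"]),
        "level": match["level"],
        "title": match["title"].strip(),
    }


print(parse_title("[2016-01-18] Challenge #250 [Easy] Scrabble in Reverse"))
print(parse_title("Weekly discussion thread"))
```

Titles that do not fit the pattern come back as None, so meta posts can be skipped rather than crashing the run.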

November 22, 2015

How to Search Github Sanely

I have a lot of open-source projects checked out on my machine. However, I am soon going to remove all of them that are hosted on GitHub. No, dear reader, I have not lost my mind. Rather, I've devised a way to search all of GitHub for code and return results in JSON. How? Read on:

And the results look like this:

November 5, 2015

How to Do Unit Conversion

I just set up a web service (link to a demo) to convert units and spit out JSON. The source is given below:

Some notes on the code: it spits out JSON in an uncompressed, non-pretty-printed format. It calls the units application to do the conversion. If I were so inclined, I'd have it access this through its shared library with Cython, but this was quick and dirty, taking me all of 10 minutes.
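The shell-out-then-JSON approach described in the notes can be sketched as follows. This is not the service's source: it assumes a units binary on the PATH that accepts -t for terse output (GNU units does), the JSON field names are invented, and the run parameter exists only so the function can be exercised without units installed.

```python
import json
import subprocess


def convert(value, src, dst, run=subprocess.run):
    """Convert value from src to dst via the units CLI; return compact JSON."""
    # Assumption: "units -t '3 feet' meters" prints just the number.
    proc = run(
        ["units", "-t", f"{value} {src}", dst],
        capture_output=True,
        text=True,
        check=True,  # raise if units rejects the conversion
    )
    result = float(proc.stdout.strip())
    # separators=(",", ":") drops the spaces, so nothing is pretty-printed.
    return json.dumps(
        {"from": src, "to": dst, "value": value, "result": result},
        separators=(",", ":"),
    )


# With units installed, this would print one line of JSON:
# print(convert(3, "feet", "meters"))
```

Passing the arguments as a list, rather than building a shell string, keeps user-supplied unit names from being interpreted by a shell.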

September 12, 2015

How to Log Activity from Your Apps

The code above implements the logging server. I'm putting it up here so that others may use it and suggest improvements. So, go ahead, rip it apart.

July 12, 2015

How to View HTML Messages in MH

The code above takes a message from an nmh folder, saves it to a file, uploads it to a remote server so it can be viewed with a web browser, and removes the message from the local mailbox. I use nmh for my work mailbox, where messages often come encrypted using PGP and therefore can't be viewed on webmail. No matter how many times I tell them, senders insist on HTML mail, so I had to find a workaround.

June 29, 2015

How to Know What Endpoints Any Werkzeug Webapp Exposes

The code above lists your Flask endpoints along with their handlers in JSON.
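For readers without the original code to hand, a minimal sketch of the idea: a Flask app keeps its routes in app.url_map (a Werkzeug Map) and its handlers in app.view_functions, so walking one and looking up the other gives the listing. The demo app, its /ping route, and the output field names are made up for illustration.

```python
import json

from flask import Flask

# Throwaway app with one made-up route, just to have something to list.
app = Flask("endpoint_demo")


@app.route("/ping")
def ping():
    return "pong"


def list_endpoints(app):
    """Return a JSON array describing every URL rule and its handler."""
    routes = []
    for rule in app.url_map.iter_rules():
        handler = app.view_functions.get(rule.endpoint)
        routes.append({
            "rule": rule.rule,                       # URL pattern
            "endpoint": rule.endpoint,               # endpoint name
            "methods": sorted(rule.methods or []),   # allowed HTTP methods
            "handler": getattr(handler, "__name__", None),
        })
    return json.dumps(routes)


print(list_endpoints(app))
```

Expect Flask's built-in static route to show up in the output alongside your own.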

June 14, 2015

How to Authenticate Users

The code below is a simple authentication system written in Python (of course) and bottlepy. It also includes a few simple unit tests:

For now, it only supports one site; I'll probably add multi-site support in the near future.

May 27, 2015

How to Remix My Shared Links

I share a lot of links: I send them by email, tweet their aggregate statistics, and provide them in JSON, RSS, and compressed CSV. Tonight, I added the option of getting them as well-formed, simple XML for Ibrahim. The entire method is 15 lines and embedded below:
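The embedded method is not reproduced in this listing. What follows is only a sketch of how well-formed, simple XML might be produced with the standard library, not the original 15 lines; the element names, the url and title keys, and the sample link are assumptions.

```python
import xml.etree.ElementTree as ET


def links_to_xml(links):
    """Serialize link dicts as <links><link href="...">title</link></links>."""
    root = ET.Element("links")
    for link in links:
        item = ET.SubElement(root, "link", href=link["url"])
        item.text = link["title"]
    # ElementTree escapes &, <, and > for us, which keeps the output well-formed.
    return ET.tostring(root, encoding="unicode")


# Invented link with ampersands, the usual way hand-built XML breaks.
sample = [{"url": "https://example.com/?a=1&b=2", "title": "Ampersands & such"}]
xml_text = links_to_xml(sample)
print(xml_text)
```

Building the tree and serializing it, instead of concatenating strings, is what guarantees the well-formedness mentioned above.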