Code | Data

Python Pipes for Text Tokenization

Tokenizing text is something that I tend to do pretty often, typically as the beginning of an NLP workflow. The normal workflow goes something like this:

1. Build a generator to stream in some sentences / documents / whatever.
2. Perform some set of transformations on
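The excerpt cuts off before showing any code, so here is only a rough sketch of what a generator-based workflow like this might look like. Every name in it (`stream_documents`, `lowercase`, `tokenize`, `pipe`) is hypothetical and not taken from the post, and the regex tokenizer is an assumed stand-in for whatever transformations the post actually uses.

```python
import re
from typing import Callable, Iterable, Iterator, List


def stream_documents(docs: Iterable[str]) -> Iterator[str]:
    """Step 1 (assumed): yield documents one at a time instead of loading them all."""
    for doc in docs:
        yield doc


def lowercase(docs: Iterable[str]) -> Iterator[str]:
    """A sample transformation: lowercase each document lazily."""
    for doc in docs:
        yield doc.lower()


def tokenize(docs: Iterable[str]) -> Iterator[List[str]]:
    """A sample transformation: split each document into word tokens."""
    for doc in docs:
        yield re.findall(r"\w+", doc)


def pipe(source: Iterable, *stages: Callable[[Iterable], Iterable]) -> Iterable:
    """Chain generator stages so each one consumes the output of the previous one."""
    for stage in stages:
        source = stage(source)
    return source


# Nothing runs until the final iterable is consumed (here, by list()).
raw = ["Hello, World!", "Pipes are lazy."]
tokens = list(pipe(stream_documents(raw), lowercase, tokenize))
print(tokens)
```

Because every stage is a generator, documents flow through one at a time, which keeps memory use flat even when the input stream is large.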

  • edh
Data

The Data Lake

I spend a lot of time thinking about how to deal with incoming data. I do this partially because it's my job, and partially because I have some kind of workflow OCD. I've been reading an increasing amount of market

  • edh
Life

On Moving from Academia to Industry

It has been a pretty hectic few weeks. I turned a year older, Maria and I visited Seattle for no good reason, I got a new job, and I told my current company that I'm leaving. Mainly I wanted to write about

  • edh

Test Post, Please Ignore

My first post, just kicking the tires to make sure everything is working appropriately. Code highlighting seems to be working fine. def say_hello(name): print('Hello {}!'.format(name)) Math is also fine. $$ \lambda x. \lambda y. P(x,y)$$ Tables are

  • edh