Combining Rule-based and Neural NLP systems to get the best of both worlds

Placeholder for the real post

Hello blog reader

Hello there!

Cruise itinerary planning using R, lpSolve

Now that nearly every company collects data about its operations, it’s possible for decisions to be made in a systematic, data-informed way, minimizing the use of heuristics. This applies especially to an operations-centric enterprise whose bottom line improves considerably when optimal solutions are found for difficult problems.

Many operations problems can be formulated as Linear Programming problems, and R has a fantastic package, lpSolve, to handle those, even with integer constraints (known as Mixed Integer Linear Programming, or MILP). Let me illustrate how lpSolve can solve operations problems with a real example from the cruise ship business.

How do cruise ship operators decide on the itinerary they should follow? The Bahamas alone has about 30 ports. A 3-day cruise has to select carefully from those. What factors would they consider? Here are only a few:

1. Attractiveness of a port to tourists.

2. Profitability for the cruise operator.

3. Number of ports visited.

4. Time required for visiting the attractions in each port.

5. Congestion at the port: how many cruise ships can dock simultaneously?

6. Constraints on port types (for example, a 7-day cruise must include at least 2, and at most 4, ports with beaches).

Here is how you formulate the problem of maximizing customer satisfaction while maintaining competitive profitability and satisfying all of the other constraints in lpSolve:

https://github.com/pranavmodi/CruiseMILP
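For readers who want the shape of the model before opening the repository, here is a rough sketch of how such a problem might be written down. Every symbol is my own assumption for illustration and is not taken from the linked code: $x_i$ is 1 if port $i$ is on the itinerary, $a_i$ is its attractiveness score, $p_i$ its profit contribution, $t_i$ the hours needed there, $b_i$ is 1 if it has a beach, and $P_{\min}$, $T$, $N_{\min}$, $N_{\max}$ are limits chosen by the operator. Port congestion is left out to keep the sketch short.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{i=1}^{n} a_i x_i
  && \text{(customer satisfaction)} \\
\text{s.t.} \quad
  & \sum_{i=1}^{n} p_i x_i \ge P_{\min}
  && \text{(competitive profitability)} \\
  & \sum_{i=1}^{n} t_i x_i \le T
  && \text{(time available for port visits)} \\
  & N_{\min} \le \sum_{i=1}^{n} x_i \le N_{\max}
  && \text{(number of ports visited)} \\
  & 2 \le \sum_{i=1}^{n} b_i x_i \le 4
  && \text{(beach ports, 7-day example)} \\
  & x_i \in \{0, 1\}, \quad i = 1, \dots, n
\end{aligned}
```

The binary $x_i$ variables are what make this a MILP rather than a plain linear program.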

 

 

Book review: Safe Area Gorazde

Safe Area Gorazde is an account of a journalist/cartoonist visiting Bosnia during the breakup of Yugoslavia in 1992-1995. Gorazde was then a UN-designated safe area surrounded by Serb-controlled territory, and it was on the verge of obliteration many times during the war.

The book is beautifully illustrated. The most fascinating drawings were not the ones depicting the destruction in Gorazde but the ones depicting its people. A book on the war in Bosnia without drawings would need many, many more words. The emotions on the faces say more than the words.

I guess it is difficult to maintain a journalistic tone when you have been in such close proximity to the people for an extended time. Joe Sacco is fairly opinionated, but it works, I think. The story is told mostly from a Muslim point of view. It was shocking how the Serbs, who were apparently at ease with Muslims for generations right up to the war, turned on their neighbors overnight: they burnt their houses and, in many cases, even murdered them. I wish there was more exploration of that aspect in this book. The actions of the Serbs are portrayed as confounding, incomprehensible.

The UN does not come out looking good here. Serbians violated the terms of the safe zone multiple times, and the UN acted only when the very existence of Gorazde was threatened. The UN has to walk a very thin line: for the concept of the UN to work, projecting neutrality is critical; otherwise it would be fighting wars regularly, and there is no way for the member nations to make that work politically. Some of the implied criticism seems unfair to me.

The topic of the book is serious, but its tone is not uniformly serious. Joe Sacco pokes fun at himself for being treated like royalty by people undergoing such hardships. Americans are held in surprisingly high regard in Bosnia; it’s certainly distinct from the rest of Europe. Perhaps it was due to the war and things have changed now?

This was my first ‘comic book’; a friend gifted it in the hope of someone appreciating graphic non-fiction. I did not know much about the ethnic conflict before reading the book. It was shocking to learn that such atrocities were committed in Europe! This was a great intro to the genre, but I think in future I’ll stick to less serious topics for my ‘comic books’.

 

Slicing, Selecting from pandas dataframes

I’m exploring the capabilities of the pandas DataFrame; it makes sense to devote some time to understanding the commonly used ones well.

These are the topics covered in this post –

1. Ways to create a dataframe – using a dict, using an ndarray

2. Creating a dataframe with indexes

3. Slicing and selecting rows, filtering the data, and selecting rows/columns according to criteria

4. Using groupby and apply on dataframes.
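As a taste of topics 1 to 3, here is a minimal sketch. It is not the notebook's code: the sample data and variable names are invented for illustration, and it assumes a reasonably recent pandas in which `.loc` and `.iloc` are the standard indexers.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this illustration.
data = {
    "city": ["Austin", "Boston", "Chicago", "Denver"],
    "coastal": [False, True, False, False],
    "score": [6, 4, 3, 5],
}

# 1. From a dict: keys become column names, lists become columns.
df_from_dict = pd.DataFrame(data)

# 1. From a 2-D ndarray: column names are supplied explicitly.
arr = np.arange(6).reshape(3, 2)
df_from_array = pd.DataFrame(arr, columns=["a", "b"])

# 2. With an explicit index: rows are labelled by city name.
df_indexed = pd.DataFrame(
    {"coastal": data["coastal"], "score": data["score"]},
    index=data["city"],
)

# 3. Select one row by label, and the first two rows by position.
row_by_label = df_indexed.loc["Chicago"]
first_two = df_indexed.iloc[:2]

# 3. Filter rows with a boolean mask.
high_scores = df_indexed[df_indexed["score"] >= 5]

# 3. Combine a row criterion with a column selection.
inland_scores = df_indexed.loc[~df_indexed["coastal"], "score"]
```

The difference worth remembering is that `.loc` works with labels and boolean masks, while `.iloc` works with integer positions.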

All the code is contained in an IPython Notebook here. It’s a very convenient tool for illustration as it seamlessly combines code, text, and plots, and even supports LaTeX markup.

Note: The code uses pandas version 0.12, the stable version at the time of writing.
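For topic 4 in the list above, a small groupby/apply sketch may help. Again, this is not the notebook's code: the `sales` data and the `value_range` helper are hypothetical, and the snippet was written with a recent pandas in mind rather than 0.12.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
sales = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "west"],
        "units": [10, 20, 5, 15, 40],
    }
)

# Built-in aggregation: total units for each region.
units_per_region = sales.groupby("region")["units"].sum()


def value_range(values):
    """Return the spread (max minus min) of one group's values."""
    return values.max() - values.min()


# apply() runs the custom function once per group and collects
# the results into a Series indexed by the group key.
range_per_region = sales.groupby("region")["units"].apply(value_range)
```

Built-in aggregations such as `sum` are usually the better choice when one exists; `apply` is the fallback for logic that has no ready-made method.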