Archive for the ‘Tools, Frameworks & Best Practices’ Category.

vote for Strawberry Perl

At least when requiring XML::LibXML in a Windows environment, it was much easier to get running with Strawberry Perl than with ActiveState, mainly because Strawberry Perl has libxml and libxslt included with the install. Also I like using cpan better than ppm. ppm archives for perl 5.14 do not seem complete, and ActiveState will not give you an earlier perl in the community edition.

So, while I appreciate the contributions of both maintainers, the most painless path is the one to take…and in this case that was Strawberry Perl 5.14 with a cpan install of some other required modules (XML::LibXML was already installed).

Ctrl-Alt-Del for VirtualBox on MacOS

OK, not a showstopper, but… I was running Windows in VirtualBox on MacOS and had to press Ctrl-Alt-Del to start. There is an Alt key on my MacBook Pro, but it’s fn-option, and that didn’t work. The answer turned out to be under the VirtualBox VM menu: Machine->Insert Ctrl-Alt-Del. There might be other ways as well, but that was enough to get it going…

hadoop fs is space-sensitive

HDFS, the Hadoop Distributed File System, is useful for big data. However, hadoop fs is not quite there as a shell replacement. Today I kept getting the message

cp: When copying multiple files, destination should be a directory.

when trying to copy multiple files to a directory using

hadoop fs -cp /path/to/files/*  /path/to/destination/directory

Finally figured out that the problem was I had two spaces between the file list and the directory path, which made hadoop not see the directory path in the command. Aaahh.
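
When scripting the copy, one way to sidestep this kind of whitespace surprise is to pass the arguments as a list instead of one long string. A minimal Ruby sketch, assuming the hadoop binary is on the PATH; the helper name hadoop_cp_args is made up for illustration and is not part of Hadoop:

```ruby
# Build the argument list for `hadoop fs -cp` as an array, so stray
# whitespace around a path can never become an empty argument.
# `hadoop_cp_args` is a hypothetical helper, not part of Hadoop.
def hadoop_cp_args(sources, destination)
  paths = (Array(sources) + [destination]).map(&:strip).reject(&:empty?)
  ['hadoop', 'fs', '-cp', *paths]
end

args = hadoop_cp_args(['/path/to/files/* '], ' /path/to/destination/directory')
puts args.inspect
# system(*args) would then run the command without going through a shell
```

Passing an array to system means each element arrives as exactly one argument, however many spaces were in the original strings.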

don’t try creating gdbm file on an nfs mount

gems/gdbm-1.2/lib/gdbm.rb:256:in `initialize': Empty database (GDBMError)

error occurs when trying to use

g = GDBM.new('somefile')

on an nfs-mounted partition. GDBM works fine on normal drives, just don’t try it on nfs-mounts. Posting this as I found nothing when I googled the error message, and wasted several minutes before I realized the problem. The error message may be specific to the ruby ‘gdbm’ gem, but the rule is a general one.
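
A simple way to follow that rule is to build the path for the database on local disk before opening it. A minimal sketch, assuming the system temp directory lives on a local drive (not guaranteed on every machine); local_db_path is a hypothetical helper name:

```ruby
require 'tmpdir'

# Return a path for the database file under a directory that is
# assumed to be on local disk (by default the system temp directory).
# `local_db_path` is a hypothetical helper for illustration only.
def local_db_path(name, base_dir = Dir.tmpdir)
  File.join(base_dir, name)
end

path = local_db_path('somefile')
puts path
# g = GDBM.new(path)   # needs the 'gdbm' gem; left commented so the sketch runs without it
```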

Wordpress debug notes

Note: I’m not a wordpress expert, just returning to it after several years without having touched PHP – and looking for the best way to quickly understand the flow of a wordpress site using buddypress and a few other plugins. Raw notes here, will be annotated as I progress…

http://fuelyourcoding.com/simple-debugging-with-wordpress/

Data Visualization

The D3 javascript library looks awesome – clean, extensible, and powerful.

Check out this example of mashing US Census boundaries with unemployment stats…

http://mbostock.github.com/d3/ex/choropleth.html

Disable spell-check in chrome

Spell checking and typeahead are two of my top gripes with modern software. URL or bookmark completion is ok, that is when I’m in the ‘trying to remember’ mode. But when I’m in the flow of writing, having the computer guess what I’m trying to say is incredibly distracting and annoying.

Originally, I thought gmail was running auto-spellcheck for me, but it was the browser, in this case Google Chrome.

In Chrome, you turn off spellcheck under chrome://settings/language ; uncheck the ‘Enable spell checking’ box underneath the list of languages.

(You can also get to this advanced settings screen by clicking on the wrench in the upper-right, selecting ‘Preferences’ and then ‘Under the Hood’, and clicking ‘Languages and Spell-checker Settings’.)

Fix for CreateQualificationType returning 400 error using rturk

Update 7/14/11 – I believe this is fixed in rturk 2.4 – thanks Mark!
http://rubygems.org/gems/rturk
***
rturk, the Ruby gem for making calls to the Amazon Mechanical Turk API, uses a REST transport layer. That’s fine, but all calls are currently performed by a GET, which has a length limitation. When making calls that include long strings of data – such as the XML for a QuestionForm structure in a qualification test – errors may occur with the unhelpful message ‘400 Request Error’.

Was able to patch it by making a change to lib/rturk/requester.rb :

46,47c46,50
< RTurk.logger.debug "Sending request:\n\t #{credentials.host}?#{querystring}"
< RestClient.get("#{credentials.host}?#{querystring}")
---
> # RTurk.logger.debug "Sending request:\n\t #{credentials.host}?#{querystring}"
> # RestClient.get("#{credentials.host}?#{querystring}")
>
> RTurk.logger.debug "Posting request to #{credentials.host}:\n\t #{params.inspect}"
> RestClient.post(credentials.host.to_s, post_params)

A more robust fix might be to use POST only for longer requests, or make it an explicit option on the RTurk object
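
A rough sketch of the "POST only for longer requests" idea, not taken from rturk itself. The threshold MAX_GET_LENGTH, the helper http_method_for and the example host are all assumptions for illustration; the real limit depends on the server:

```ruby
# Pick the HTTP verb from the size of the full request URL.
# MAX_GET_LENGTH is an assumed threshold, not a documented limit.
MAX_GET_LENGTH = 2000

# Hypothetical helper: returns :get for short requests, :post for long ones.
def http_method_for(host, querystring, max_length = MAX_GET_LENGTH)
  url = "#{host}?#{querystring}"
  url.bytesize > max_length ? :post : :get
end

short = http_method_for('https://example.invalid', 'Operation=GetAccountBalance')
long  = http_method_for('https://example.invalid', 'Test=' + 'x' * 5000)
puts short.inspect
puts long.inspect
```

In requester.rb the result could then decide between the existing RestClient.get call and the RestClient.post call from the patch above.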

Workaround for corruption in saving bytea data thru rails

I’m not sure where the bug is, but when saving some binary data that was generated in a Rails3 before_create callback, it kept getting truncated in the actual INSERT INTO statement (though it appeared fine even in after_save callback, it was truncated in the database). Using PostgreSQL 8.4.7 with pg (0.9.0) and activerecord 3.0.5

Seemed like it could be related to this fixed bug, but my problem is on the save itself:
https://rails.lighthouseapp.com/projects/8994/tickets/611-cannot-write-certain-binary-data-to-postgresql-bytea-columns-in-2-1-0

In any case, found a simple workaround: Base64-encode the marshaled data before saving, and decode it on loading.

   before_create do
      ... stuff that builds my_hash ...
      self.my_hash = Base64.encode64(Marshal.dump(my_hash))
   end

Then later, to reconstitute the hash,

   loaded_hash = Marshal.load(Base64.decode64(@record.my_hash))
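
To check the round trip outside of Rails, here is a small standalone sketch. The contents of my_hash are made up for illustration; only the encode and decode calls mirror the snippets above:

```ruby
require 'base64'

# Example data including raw bytes that are not valid text.
my_hash = { 'name' => 'example', 'blob' => "\x00\xFF\x10binary".b, 'count' => 3 }

# Marshal to a binary string, then Base64-encode so only ASCII
# characters reach the database column.
encoded = Base64.encode64(Marshal.dump(my_hash))

# Reverse the two steps to get the original hash back.
loaded_hash = Marshal.load(Base64.decode64(encoded))

puts encoded.ascii_only?
puts loaded_hash == my_hash
```

Since the encoded value is plain ASCII, it could arguably live in a text column rather than bytea. Marshal.load should only be used on data you wrote yourself.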

Rails docs!

Found a real effort to doc the rails framework! Not all filled in, but a huge improvement over the tutorial-as-doc approach:

http://apidock.com/rails

e.g. link_to

Thanks APIDock!