Never assume no one reads the comments in your code

So, a few years back a client of mine was very excited about their recent uptick in web traffic, though it wasn’t very clear why.  The client had just finished a complete redesign of their website, so at first we assumed it was the new site and all the SEO kungfu of the hotshot designer they had hired.

It had been a few months and the designer was not responding to repeated emails, so I decided to help them out.  But after digging a little deeper, it became even more confusing. The main keyword for all the traffic was completely unrelated to the client or their industry. After researching the keywords, I realized it was another local company. And then it clicked…

In researching the problem, I found a recurring potential cause in forums: this often happens when designers plagiarize another site. So I looked at the source for the other local company’s web site and sure enough, they were using my client’s Google Analytics code.  Just the home page though… all other pages had, what I assume, was their own code.

Still no response from the designer, so I called the company with the copied home page. But they acted like I was crazy, had no idea what I was talking about, and finally ditched me into some sales voicemail that never called me back. I filled out their contact-us web form and even tried emailing postmaster@… but no response.

Why would they plagiarize my client’s site? It looks and functions nothing like it. It finally occurred to me that it was the hotshot, SEO-optimized, cloud-based, paradigm-shifting web designer who screwed it up (let’s call him [hotshot] to protect the innocent). By this time, he was already persona non grata to my client for reasons they never explained… though it was about to become clear.

While I was comparing the source of the two sites, I caught something I missed the first time. Just happened to notice this little nugget of comment in the header:

 This site was built on the cheap and does not include a lot of great programing. In particular, the global navigation system could/should have used jquery for a cleaner more, more professional codebase.
 A jquery version was built (for fun), but was not implemented because this hack worked and because it was done on the developer's own time.

The analytics code issue was funny… but this just made my day. Forwarded it to a few of my friends, had a good chuckle and forgot about it for a while.

With no response from their sysadmin, I gave up and decided to focus on my client’s account. So I set up some filters to keep this impostor site from tainting our analytics. Much to my client’s chagrin, their traffic was no longer so exciting.

A few months later, I tried their contact-us form again just for kicks.

You are still using the wrong Analytics tracking code in your home page. Please search your source code for UA-2#####4-1 and replace it with UA-6#####1-1. We have managed to filter out the results from your domain. But, I thought you would want to know.

p.s. never assume no one reads the comments in the source code. :)

To my surprise, I got this the next day:

Thank you for informing us of the incorrect tracking code. I really appreciate the filtering of results of our site and I am now in the process of changing that code. We had originally hired an outside group to build the website and in the nicest of term they did not deliver anything from any phase of the project. I have just recently take a little more control of website design and maintenance. I have not had the opportunity to look into the analytics side of our site, because I did not set it up.
I do know people read the source code, but I am unsure what you have seen that would elicit the smiley. I would like to know if there are any comments buried that put a bad light on my company. For I have edited out the “Jackwagon” comments the previously mentioned company put in there. I edited them to tell a more truthful story of their dealings with us.

Well, the “jackwagon” comments definitely earned you a smiley, nothing more sinister than that. And my filter doesn’t help you at all. Regardless… time to check out the updated headers:

This site was built incorrectly and poorly by [hotshot]. In particular, the global navigation system could/should have used jquery for a cleaner more, more professional codebase, but shortcuts were taken by [hotshot], unauthorized shortcuts.

The site is now taking on necessary corrections that should have not been needed if site was correctly built and to budget by [hotshot].

The global navigation, instead, uses a hack for the mouseover/mouseout triggers related to showing/hiding the secondary navigation bar. It uses what we call a "phantom zone," which is a set of four floating objects that surround the global navigation chunks, including the crumbtrail.

A jquery version was built (for fun), this was the fun that was had while wasting customer time in meetings, but was not implemented because this hack was used and because the developer wasted the customers time and money.

Alas, as entertaining as all this may be, the new guy has failed to fix one rather important mistake: THEY’RE STILL USING OUR GOOGLE ANALYTICS CODE! Oh well, we filtered their domain out of our reports a long time ago. Maybe he’ll catch it when he finally implements jquery.

Update: They did finally fix the GA code… but they never implemented jquery. This was 4 years ago and the comments still remain.

<!--var pageTracker = _gat._getTracker("UA-2#####4-1"); the jackwagons who originally built this site put in the wrong id. -->
Posted in Blog, Tech Support, WTF

EqualLogic Preemptive Drive Failure

We had another drive fail the other day. And just like last time, upon further examination, we found other drives with bad blocks (see my prior post about command line tools). Last year, some of these bad blocks caused a tiny bit of data loss. This time, however, EqualLogic support had a new trick up its sleeve… for lack of any official term for this process, I’m calling it Preemptive Drive Failover.

Just like last year, our tech said we should go ahead and replace the other disk even though it was still operational; it had enough bad blocks to warrant an exchange. You may recall my concern over the process… last time, they said to just pull the potentially bad drive, wait for the rebuild on the spare, then pull the next drive… rinse and repeat. Again, I’m pulling drives with bad blocks but still have to rely on other drives with bad blocks to rebuild to the new spare.

Once the initial rebuild was completed and a fresh spare was ready, they walked us through the process of mirroring a drive with raidtool.  Finally…  this is exactly the process I was asking for when we last spoke.

Here is the process they had us follow. First, SSH as root (same password as grpadmin) to the member with the bad drives (it must be the member’s IP, not the group IP):

# raidtool -m enable
enable mirroring
# raidtool -m rebuild_enable
enable mirror rebuild

Now that mirroring is enabled, you initiate it from the bad drive…  it automatically uses the spare.

# raidtool -m 11 initiate   
mirror drive 11, initiate
initiateMirror: disk=11
copy to spare initiated

This kicks off the copy process. It will take about as long as a normal rebuild; for our 2 TB SAS drives, that’s about 9 hours. I’m not sure why, but they had us disable mirroring immediately, while the rebuild/mirroring was still running.

# raidtool -m disable       
disable mirroring

Once the rebuild is complete, the member will send an alert saying the source disk has failed: “ERROR event from storage array member1; Disk 11 failed.” Once this happens, you need to disable rebuild before you can swap in the new drive.

# raidtool -m rebuild_disable

Now, once you put in a new drive, it becomes the spare. Presumably, if there is an issue reading the questionable drive, the array can pull the necessary block from another drive, so you aren’t as reliant on questionable drives as you would be had you just yanked the drive to begin with.

It doesn’t take much longer with this method since it essentially starts the rebuild before you swap the drives.  But you definitely won’t sweat the rebuild like I did the first time.



Posted in Blog

The Day We Fight Back

Regardless of what you think of Edward Snowden, his actions illustrate one thing clearly: the data collected by the NSA is rife with potential for abuse. No matter what slim chance of good may come from it, rest assured someone will use it for personal or political gain. Whether it comes from another activist like Snowden, a foreign entity like Russia or China, or our own increasingly repressive government, it will ultimately be used more often against us.

Despite how little the NSA now claims it actually collects (news stories are saying only 30% was ever collected), we all know the goal is 100%. It’s time for this bad idea to come to an end.

If Congress thinks this is a good idea, then maybe the NSA should publish the details of every interaction our congressional officials have with each other and their donors.


On February 11, on the Day We Fight Back, the world will demand an end to mass surveillance in every country, by every state, regardless of boundaries or politics. The SOPA and ACTA protests were successful because we all took part, as a community. As Aaron Swartz put it, everybody “made themselves the hero of their own story.” We can set a date, but we need everyone, all the users of the Global Internet, to make this a movement.


Please take some time to let your elected officials know you want your freedom back.


Posted in Blog

30 Years of Mac


Today marks the 30th anniversary of the Macintosh.

I got my first Mac and first computer when my uncle handed it down to me in the early ’90s…   And I was so disappointed.   All my friends had PCs and were on BBSs… Some even had color screens!  But that attitude changed quickly. I think my discovery of HyperCard (among many others) may have sparked my interest in programming and design.

I still have my old SE and we powered it on today for the anniversary.

Still runs great, though a little slow.

Posted in Blog

PowerConnect POE Power Cycle

We’ve been having an issue with the network in an adjacent office. This office has 2 stacked Dell PowerConnect 6248P POE switches connected to our main data center stack with 10GbE over Cat6. The problem is that if the gateway is unavailable for a fraction of a second, all of the phones in the remote office lose connection to the VoIP server and never retry (I think that’s a shortcoming of the phone, but I’m not likely to get any real support out of them).

So, any time we reboot the switch or even clear xlate on the firewall, the phones drop off. Our initial solution to this was to unplug/replug all the phones, which was tolerable when there were fewer than 15… but we have more than 30 now. I know there’s got to be a way to reconfigure the switches to better handle this situation… unfortunately, I haven’t found it yet.

But I did start thinking about the CLI and realized we could toggle the power to each port by enabling/disabling POE.

Unfortunately, the interface range command does not work with POE commands; power isn’t a valid command when you select multiple ports.

6200Switch(config)#interface range ethernet 1/g1,1/g2
6200Switch(config-if)#power inline auto 
% Invalid input detected at '^' marker.

But it does if you select an individual interface.

6200Switch(config)#interface ethernet 1/g1
6200Switch(config-if-1/g1)#power inline auto

Unfortunately, that would be a lot of typing… so this is where I enlist Excel to do my dirty work. I won’t bore you with the details, but you can download my Excel template below.

Basically, you select each port and disable POE with the power inline never command.  Then do it all again but with power inline auto to turn them all back on.

interface ethernet 1/g1
power inline never
interface ethernet 1/g2
power inline never
interface ethernet 1/g3
power inline never


interface ethernet 1/g1
power inline auto
interface ethernet 1/g2
power inline auto
interface ethernet 1/g3
power inline auto

Here’s the Excel file I used to send a full (ports 1–48) POE power cycle to the switch. Set the stack member number and delete the rows for ports you don’t need.

Download Excel template
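If you’d rather skip the spreadsheet, a quick shell loop can generate the same command list. This is just a sketch assuming 48 ports on stack member 1; adjust the range and the member number to match your stack:

```shell
# Emit "power inline never" for ports 1/g1 through 1/g48, followed by the
# matching "power inline auto" lines to turn them all back on.
for mode in never auto; do
  for port in $(seq 1 48); do
    echo "interface ethernet 1/g${port}"
    echo "power inline ${mode}"
  done
done > poe_cycle.txt
```

Paste the never half into config mode, wait for the phones to power down, then paste the auto half.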



Posted in Blog

The Bieber Algorithm

So I had to reverse engineer some software we use at the office the other day.  It uses a hash to store its current users in a table, and we needed a way to monitor who was using the system when.

Their old version was easy…  it used to store the users and machine names in a table in plain text.  Users could only be logged out by clicking the logout button…  if they closed the window, their account was left in limbo until they returned.  The only official “supported” solution from the vendor was to login as that user then log them out.  But that was a security nightmare and a royal pain.

We developed a simple web page that listed all the users… If there were too many logged in, we could check to see if someone had forgotten to logout.  Then deleting their record would effectively kick them off and open up a new seat for a different user.

But then came an “upgrade” that introduced a “new” licensing mechanism. It doesn’t leave as many users logged in like the old one did, but I don’t think they improved it much; there are still many more users than we expect to see at some given times. However, it did succeed in obfuscating who is logged on.

I started by running SQL Profiler while using the user list function within the software. I found it only checks the licensed user table, which uses the encrypted hash.

Noticed the hash was probably Base64… tried decoding that, but no dice. Since the new version is written in .NET, I started searching for Base64 C# encryption… found a few good examples since .NET includes several methods for this, but most of them used Rijndael or AES… bottom line: they all used a damn good key and salt… not looking good for blindly decrypting.

So my next idea was to take a look at the DLLs included in the software. Did a quick search for a .NET decompiler and found JetBrains dotPeek, which was free and seemed to be well respected.

15 minutes later, bingo: Security.LicenseKey.Decrypt(string EncryptedText)

This was more than I had expected.  Not only did I find the method, it showed the key and IV…  but then…  not sure if I can take their programmers seriously now.

Though, no one in their right mind would guess that key…  Security by celebrity?

In the end, since the method was exposed, my tool now uses the vendor’s DLL to decrypt the hashes for me.
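For the curious, once you have the key and IV you don’t strictly need the vendor’s DLL at all; openssl can decrypt that kind of Base64-wrapped ciphertext straight from the shell. This is just an illustration with a completely made-up key, IV, and record format (the vendor’s actual cipher mode and values stayed in the DLL):

```shell
# Hypothetical key/IV -- NOT the vendor's real values.
KEY=4a757374696e426965626572526f636b   # "JustinBieberRock" hex-encoded, 16 bytes
IV=00000000000000000000000000000000

# Round trip: encrypt a fake "user|machine" record, producing the kind of
# Base64 hash the license table stores, then decrypt it again.
HASH=$(printf 'jsmith|WKSTN07' | openssl enc -aes-128-cbc -a -A -K "$KEY" -iv "$IV")
printf '%s' "$HASH" | openssl enc -d -aes-128-cbc -a -A -K "$KEY" -iv "$IV"
# prints: jsmith|WKSTN07
```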

Posted in Tech Support, WTF

Arduino & Woz

I spent some time this weekend working with a graphic display on my Arduino. I haven’t written a game since high school, and it sounded like a good weekend project.


On an impulse, I picked up a small LCD from Amazon (LCD4884). It’s a little $15 b&w Nokia screen with 84×48 resolution. Unfortunately, it seems to be mostly for text, as I couldn’t find a lot of help working with the graphics. So I wrote a few functions to help (I’ll post them later). I think I’m really just lacking some knowledge of arrays and byte logic… so my workaround may just be one of those things real programmers like to laugh at.

Now, when I say “write”, I just mean recreating/porting games I’ve seen before. Back in the early ’90s, I did both Pong and Breakout, among others, for the TI-82, but I’ve never done anything like this. I’ve forced myself to learn C# since it’s similar enough to… but I really wanted to get deeper into C, not deeper into a Microsoft mess.


Working on my weekend project, I didn’t really catch the coincidence until later Sunday night, when I went to a UofA lecture series to see Steve Wozniak. I knew his story well, but it didn’t occur to me until I was listening to him recount his time with Jobs at Atari working on both Pong and Breakout. Somehow my weekend project now seemed different… Woz had built these games in hardware… no microprocessor… no computer or IDE to help develop and debug with. No, he did it all in his head.

He took questions for a few minutes afterward… I really wanted to stand up and ask for his thoughts on Arduino and other maker projects.


Posted in Blog

EqualLogic Command Line

Well, we had another failure late last month. The rebuild went fine… and quickly, since I had another disk on hand. That disk was supposed to replace the other disk with bad blocks, but I’ve been too busy and a little too trepidatious to break the array since it’s running fine and not complaining. We have backups… I just didn’t have the time to rebuild all that. After all, we’ve had 2 failed disks in our PS6100 over the last 3 months.

We just picked up a 2nd PS6100 and spent the last week moving volumes to the 2nd array. So, now I feel comfortable breaking things. And Dell is going to bill us for the other drive if I don’t send the bad one back soon.

But, which disk was it again? I had notes of the conversation with EQL support, but I didn’t write down which one had the bad blocks and high error rates. And the GUI wasn’t helpful at all… it showed all drives with no errors.

[Screenshot: Group Manager GUI showing all drives with no errors]

The only way to know for sure is to check on the console or terminal.  Now, I had some notes on unsupported bash commands, but nothing was working.

SAN> su exec bash
You are running a support command, which is normally restricted to PS Series Technical Support personnel. Do not use a support command without instruction from Technical Support.

…but that’s all it did. It didn’t start bash; it just dumped me back to the same prompt. Nor did it allow any special commands. After trying a few things, I eventually found that just “su” will get you to a support prompt:


“exec bash” still didn’t do anything.  But ? gives a list of commands…
cleanup-nas-service - cleanup-nas-service <nasServiceName>
nas - NAS support command
repl-use-jumbos - Configure use of jumbo frames between replicaton partners.
snmp - snmp
time-protocol - time-protocol ntp | time-protocol sntp
alias - Performs text substitution.
clear - Clears the screen.
cli-settings - Specifies certain CLI settings.
exec - Executes a CLI script file.
exit - Brings the user up a command level from subcommand mode.
help - Displays information about the CLI commands.
history - Displays the command history.
logout - Logs out a group administrator.
stty - Displays terminal settings.
tree - Displays the full CLI command syntax in a tree structure.
whoami - Display the user logged in to this cli session.
<cr>

I want to run diskview… but it still complains:

SAN(support)> exec diskview -i 0

Error: Too many parameters

So, it’s running exec and sending all the parameters separately. Here’s what I eventually did…

exec "diskview -i 4"

Analysis of drive 4:

Approved drive by, signature .

The drive has had 583410022 IOs issued to it and 37 errors. Access based on percent region of the drive below:
 0 - 25 : 43.3(252907936) percent(Count) / 8.1(3) Error percent(Count).
 25 - 50 : 50.8(296230990) percent(Count) / 91.9(34) Error percent(Count).
 50 - 75 : 3.5(20319965) percent(Count) / No Errors.
 75 - 100 : 2.4(13951131) percent(Count) / No Errors.

Current Preemptive removal status is: Drive remove has been requested.

That confirms #4 is our bad drive.

Though I’m curious about the “Preemptive removal status” part… I mentioned on the initial call that I was nervous about pulling the drive, but the tech said that was the only way. However, I thought they should have a preemptive removal function that would prematurely start a rebuild with all disks still available. Drive 4 still had 99% good sectors… I’d rather rebuild without affecting the physical disks. And if the rebuild fails, we could lose data and I’d feel 100% responsible.

Detailed error history(upto the last 10 errors): 37 total errors have been logged for this drive.

        At local time of Fri Jan 11 22:27:42 2013
                A Read at LBA 1176596223 for 1 blocks, 
                 with a IO Error(0x5) error with a sub error of Uncorrectable error(0x4)
                (ASC/ASCQ(0x11/0x1) = Read retries exhausted) 
                and a recovery time of 1 seconds.
            Special actions taken during error recovery:
                1 bad blocks were returned to raid.
                This IO required 1 retries.
                An error was returned for this IO.

then finally:

 exec "diskview -j"

shows us this table:

[Screenshot: diskview -j drive status table]


Ultimately, the rebuild went fine and was back to 100% by late Saturday night. I changed RAID from 50 to 6 shortly after, and it took much less time than I had figured. And we haven’t seen any noticeable hit to performance.




Posted in Blog

EqualLogic with a questionable drive

Well… last month we had one drive completely fail, which is what these arrays are supposed to protect against.  However, when this happened, there were enough bad blocks on another disk that we had some affected volumes.

I’m used to dealing with dumb RAIDs that completely fail on things like these, so I was relieved that my year old EqualLogic simply told me which volumes were affected and moved on.

During our initial tech support call with Dell, we decided to go ahead and replace the disk with bad blocks as well. They only overnighted one drive… which is fine, you can’t replace them both at the same time… so we got the other later that week.

Now that the rebuild is complete…  I’m left with the unenviable task of breaking a functioning array and hoping it rebuilds.  Time to verify all my backups…


Posted in Blog

PS6100 semi-catastrophic failure

A drive failed, at 4:00 on a Friday (2 weeks ago)…  Then at about 11 that night, the rebuild failed.  I thought it was the end of the world; especially since we’re severely overextended with our storage.

I’ve never experienced anything like that on a SAN, so I was very relieved to see it really didn’t faze it much. I was afraid the whole thing would collapse and I would have to rebuild everything from scratch.

But we got lucky: the bad blocks belonged to only one important volume; the rest were snapshots and backups I hadn’t bothered to delete. The only one in use was left as online-bad-blocks while the other backup volume was left offline-bad-blocks (but it was only a temporary replica from last year).

I just deleted the damaged snapshots and the old backup volume. But the one in use (one of the first volumes I created) seemed to be working fine… the EQL tech said it will only throw an error to the OS if the OS tries to read one of the bad blocks… and we can’t really tell if they were in use or just free space in the VM… but if they are simply overwritten by the OS, the array will mark them as good and move on. I migrated the 2 production VMs on that volume to another and it didn’t complain.

I’m about to pull the “bad” drive… though the array still thinks it’s good, the tech and I both agreed that it had too many errors in its stats… so they sent me another. We decided to wait until the weekend before we chanced anything.

Now I’m trying to decide if I want to convert the RAID50 to a RAID6 (once the rebuild is complete, of course). EQL seemed to think there wouldn’t be any discernible difference in terms of performance. And I’d rather have protection against any 2-drive failure than hope the two that fail aren’t in the same set. With drives as large as they are today, it’s just an accident waiting to happen.

All things considered, we’re very happy with our purchase…  now we just need another.

Posted in Tech Support