charlie's blog

Thursday, May 22, 2014

identity is hard

Equivalence

Take a look at the following pairs and decide whether the two things are equivalent (the same):

"cat" vs. "cat"
"dog" vs. "Dog"
4 vs. 2+2
"color" vs. "colour"
"wanna" vs. "want to"
1+2 vs. 2+1

Your specific answers probably differ from mine, but I bet you said "the same" for some, "different" for some, and maybe "it depends" for some.

For instance, I'm sure we agree that "cat" and "cat" are the same. We would probably say that "dog" and "Dog" are the same thing too, at least most of the time. What about "color" and "colour"? I'm betting that most English speakers would say they're essentially the same thing, just two ways of spelling the same word. An etymologist might disagree and say that they differ in some interesting technical sense. Someone with strong American or British pride might give you an earful.

Likewise, the difference between 4 and 2+2 is arguable. In some senses they are the same thing, since they can both be seen to represent the quantity 4. However, if you wanted to tell someone what time your kids come home from school, you probably wouldn't say they come home at "2 plus 2 o'clock", so they obviously aren't totally interchangeable.

The crux of the matter is that equivalence is context dependent. Whether 4 is the same as 2+2 depends on whether you're in a mathematical context or a social context. The equivalence of "dog" and "Dog" might depend on whether you're typing the words into a search engine, where you'll probably get the same results for both, or using them to start a sentence, where one is correct and the other is wrong.

So why do I care about equivalence being fuzzy? Mainly because computers aren't very good at "fuzzy".

Fuzzy problems

One of the most common things computers do is compare things. For instance, a tax program might compare your income against some threshold value to determine whether you owe more or less money. Or it might check to see if your age is greater than 65, to help determine eligibility for retirement benefits. Comparisons that involve numbers are generally pretty easy, so computers do reasonably well at this. But what happens when we try to compare something like a name?

Imagine you have a customer account with www.bunnyslippers.com. When you registered the account last year, you provided your name (Fred Flanders) to create a customer account. Each time you log in, the server looks through the list of known customers and decides whether any of those names matches "Fred Flanders". If it finds one, it can verify the associated password and allow you to proceed.

Now what happens if you try to log in and accidentally type "FreD Flanders" instead of "Fred Flanders"? A human handling this sort of request would probably not even notice that you accidentally capitalized the D in "FreD". A computer, on the other hand, might or might not see the two as the same, depending on how careful the programmer was. By default the computer sees the letters as numbers; in ASCII, 'd' is 100 and 'D' is 68. So when you ask a computer whether "Fred" and "FreD" are the same, it sees two different sets of numbers, and it says they aren't the same.
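To make the numbers concrete, here is a minimal sketch in Python (the language is just my pick for illustration; the helper name "codes" is made up). It shows the character codes behind each name and the result of the default, strict comparison.

```python
# Each character is stored as a number (its code point).
lower_d = ord("d")   # 100
upper_d = ord("D")   # 68

def codes(text):
    """Return the numeric code of each character in text."""
    return [ord(ch) for ch in text]

fred = codes("Fred")       # [70, 114, 101, 100]
fred_typo = codes("FreD")  # [70, 114, 101, 68]

# The default comparison is strict: the number sequences differ,
# so the strings are reported as different.
same = ("Fred" == "FreD")  # False
```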

So why not just tell the computer that "d" and "D" are the same? Okay, done. Hopefully you don't mind if Microsoft Word now occasionally replaces all the d's in your term paper with D's. They're the same now, so what difference does it make?

Hopefully the problem is becoming clearer now. Sometimes we want our computers to see "d" and "D" as the same thing, and sometimes we don't. The difference is in the context.

Computers aren't fuzzy

The fundamentals of computing are deeply rooted in mathematics. The earliest computers were designed to calculate solutions to complex ballistics problems, and for the most part they remain glorified calculators. The only things they can really do are math and various operations on individual bits. In this sort of basic math, there's not a lot of room for context. The number four is always the same, so there's not much fuzziness to deal with.

As befits their mathematical underpinnings, most programming languages support operations like addition, subtraction, and the like. They also support comparisons, including tests for equality (many languages use "==" for this instead of "=", since the latter is often employed for another purpose). These operations make good sense for working with numbers, but it gets trickier when they get applied to other sorts of data.

For dealing with textual data, computers use something called a "string", which is a sequence of characters (single letters). "Fred" can be treated as a string, composed of the characters "F", "r", "e", and "d". Most languages allow the equality test (==) to be used on strings, and here's where things get tricky: by default this test looks for a very strict numeric equivalence, the kind that says "d" and "D" are not the same. The human programmer may not have intended that, though; the programmer is quite often aiming for some fuzzier form of equivalence.

To deal with this, computer languages often provide various specialized ways to compare strings, which can be used in different contexts. One way is the super-strict "no differences whatsoever", but you can also specify a comparison that ignores case (so that "d" and "D" become the same), or even a comparison that first applies all kinds of interesting linguistic normalizations to smooth out variations. The strict comparison can be used in strict contexts (e.g. verifying passwords), and the looser comparisons can be used for things like finding names in customer databases.
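As a rough sketch of what those modes can look like, here is a small Python example. The "Comparison" enum and the "strings_match" function are names I made up for illustration; only "str.casefold" and "unicodedata.normalize" come from the standard library. The normalized mode here uses Unicode NFKC plus case folding, which is one reasonable choice among several, not the only way to do it.

```python
import unicodedata
from enum import Enum

class Comparison(Enum):
    STRICT = "strict"            # no differences whatsoever
    IGNORE_CASE = "ignore_case"  # "d" and "D" count as the same
    NORMALIZED = "normalized"    # also smooth out Unicode encoding variations

def strings_match(a, b, mode):
    """Compare two strings using an explicitly chosen comparison mode."""
    if mode is Comparison.STRICT:
        return a == b
    if mode is Comparison.IGNORE_CASE:
        return a.casefold() == b.casefold()
    if mode is Comparison.NORMALIZED:
        def normalize(s):
            return unicodedata.normalize("NFKC", s).casefold()
        return normalize(a) == normalize(b)
    raise ValueError(f"unknown comparison mode: {mode!r}")

# The login typo: different under the strict test, same when case is ignored.
strict_result = strings_match("Fred Flanders", "FreD Flanders", Comparison.STRICT)
loose_result = strings_match("Fred Flanders", "FreD Flanders", Comparison.IGNORE_CASE)

# Two encodings of the same accented name: one precomposed character
# versus a plain "e" followed by a combining accent.
composed = "Jos\u00e9"
decomposed = "Jose\u0301"
case_only_result = strings_match(composed, decomposed, Comparison.IGNORE_CASE)
normalized_result = strings_match(composed, decomposed, Comparison.NORMALIZED)
```

The caller has to name the mode every time, so the strict test can be reserved for passwords and the looser ones for customer lookups.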

Identity is fuzzy too

So far, all the examples I've used to talk about equivalence have been simple, interchangeable things, like apples. Two totally identical red apples are more or less the same, and you'd probably be equally happy to have one vs. the other. There are lots of things like this in life, but not everything fits that description. If I were to replace your favorite old leather jacket, which smells like your dad and has years of old memories attached to it, with an old leather jacket I found at Goodwill, you likely wouldn't be happy at all. As another example, if you repainted your red Ford Mustang to be chartreuse, you'd still expect everyone to know it was your car, right?

What I'm getting at is the concept of identity. Just like equivalence, identity is something we understand naturally and deal with all the time. Everyone understands that the lime green car you're driving today is the very car you were driving yesterday: it's your car, the particular car that you own. Likewise, your old leather jacket has specific unique value to you; it has a particular identity which distinguishes it from other similar leather jackets. Identity is closely related to continuity over time; the significance of your jacket's identity has a lot to do with it being the very jacket that you had on all those previous occasions in your life.

The word "same" can refer to both equivalence and identity, even when the two concepts are in opposition (that pesky fuzziness again). For instance, if we're talking about identity, I might say that the green car I see today is the same car that you were driving yesterday (i.e. both cars were your particular Mustang). However, visually the cars are not equivalent, so I could just as correctly say that the car is not the same today as it was yesterday.

Computers and identity

Programmers often refer to the "identity" concept above as "reference equality" (as distinguished from "value equality", which is the equivalence concept). For reference equality we usually pick some stable identifier, such as a VIN or an email address, and use that for identity. Reference equality is often less fuzzy than value equality, but there's still room for error; for example, if you use an email address as the identifier, do "fred@domain.com" and "Fred@domain.com" identify the same account? Probably they should.
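Here is a small Python sketch of the distinction, using the repainted Mustang. The "Car" class, the two helper functions, and the VIN values are all made up for illustration. The last few lines show Python's own built-in versions of the two ideas: "==" on a dataclass compares values, while "is" asks whether two names refer to the very same object.

```python
from dataclasses import dataclass

@dataclass
class Car:
    vin: str    # stable identifier; the values used below are made up
    color: str

def looks_the_same(a, b):
    """Equivalence in the visual sense: compare appearance only."""
    return a.color == b.color

def is_the_same_car(a, b):
    """Identity: compare the stable identifier, ignoring appearance."""
    return a.vin == b.vin

yesterday = Car(vin="EXAMPLE-VIN-0001", color="red")
today = Car(vin="EXAMPLE-VIN-0001", color="chartreuse")   # your Mustang, repainted
neighbor = Car(vin="EXAMPLE-VIN-0002", color="red")       # a different red car

repainted_looks_same = looks_the_same(yesterday, today)    # False
repainted_is_same = is_the_same_car(yesterday, today)      # True
neighbor_looks_same = looks_the_same(yesterday, neighbor)  # True
neighbor_is_same = is_the_same_car(yesterday, neighbor)    # False

# Python's built-in flavors of the same two ideas.
alias = today
twin = Car(vin="EXAMPLE-VIN-0001", color="chartreuse")
alias_is_today = alias is today   # True: two names, one object
twin_equals_today = twin == today # True: every field has an equal value
twin_is_today = twin is today     # False: equal values, separate objects
```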

Sometimes computers and programmers don't pick the right type of equality to match our expectations, or they don't implement it the way you'd expect. A good example is that term paper you wrote last week on your computer: "The Strange World of Ocelots". You probably view that paper as having a particular identity, after slaving over it for hours. If I asked you where the paper is right now, you could presumably tell me what computer and folder it sits in. You might even have created a shortcut to it on your desktop.

So what happens when you rename the file to "Ocelots - Strange but Wonderful"? I'll tell you what happens: the shortcut on your desktop may not work anymore. That's because most computers consider the identity of a file to be solely a matter of the file's name (as well as the names of the folders it lives in), and you just changed the name. Of course, this behavior seems totally wrong to the average person, because in our minds, the paper on ocelots is still perfectly well identifiable as itself.

By the way, this example does actually work in newer versions of Windows, thanks to the Distributed Link Tracking service. However, the need for a specialized service to make this work only emphasizes how tricky the problem is.
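For what it's worth, many filesystems do keep an identifier for a file that survives a rename. The Python sketch below illustrates that idea; it is not how Distributed Link Tracking works, just a different way of looking at file identity. The "file_id" helper is a made-up name, and I'm assuming a filesystem where the device and inode numbers reported by "os.stat" stay stable when a file is renamed within a folder (true of typical ones).

```python
import os
import tempfile

def file_id(path):
    """Return an identifier for the file itself, independent of its name."""
    info = os.stat(path)
    return (info.st_dev, info.st_ino)

with tempfile.TemporaryDirectory() as folder:
    old_path = os.path.join(folder, "The Strange World of Ocelots.txt")
    new_path = os.path.join(folder, "Ocelots - Strange but Wonderful.txt")

    with open(old_path, "w") as paper:
        paper.write("Ocelots are strange.")

    id_before = file_id(old_path)
    os.rename(old_path, new_path)
    id_after = file_id(new_path)

    # Anything that remembered only the old name is now broken...
    old_name_still_works = os.path.exists(old_path)
    # ...but the underlying file is still identifiable as itself.
    same_file = (id_before == id_after)
```

A shortcut that stored the identifier instead of (or alongside) the name would have a way to find the paper again after the rename.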

Where am I going with all this?

The point of all this is that equivalence, identity, and sameness are hard: hard to describe, and hard to implement correctly. I don't have any magic solutions to offer, but I believe that thinking more about this topic can help programmers write better software with less effort and pain.

Some specific ideas for programmers to keep in mind (including my future self):

  • When designing new systems, think about what types of equivalence/identity might be involved, and what behavior the user will expect.
  • Be careful with standard/easy ways of comparing things (e.g. operator== and Object.Equals). Does the easy way actually have the semantics you want?
  • Better yet, be explicit about how two things will be compared. For instance, use function overloads which explicitly specify string comparison modes.
  • Make sure you understand the language and use it as intended. For example, C# provides a very simple reference equality for reference types. It also uses value equality for built-in value types, which means you should usually do the same with your own value types (so you don't surprise people).
  • Find or build automated tools to help verify your code's correctness. For instance, you can use FxCop to verify that you're explicitly specifying string comparison modes whenever possible.


Sunday, February 9, 2014

how do I write an Alt+Tab replacement program?

Over the years I've done several Alt+Tab replacement projects, and recently I started a new one. Unfortunately I found that the techniques which worked under Windows XP do not work under Windows 7 (and presumably not under Windows 8 either). I did eventually get it to work, and created this page to document what was required.

Here are the essential ingredients for an Alt+Tab replacement (a "switcher") under Windows 7:

#1: Capture Alt+Tab

Under Windows XP this was easy: you would just use RegisterHotKey to override the default handling of Alt+Tab. Unfortunately this does not work under Windows 7*. Instead, the best option I found was to set a Windows hook using SetWindowsHookEx, specifically a low-level keyboard hook (WH_KEYBOARD_LL). From here you can detect the Alt+Tab key combination and invoke whatever UI you want.

* Actually, it might work once you enable uiAccess (see below). I never tried it again after switching to Windows hooks.
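The detection logic that would sit inside such a hook callback can be sketched separately from the Win32 plumbing. The Python below is only that logic: it does not install a hook, and the "AltTabDetector" class is a made-up name. It assumes the callback hands over a virtual-key code plus a flag saying whether the key went down or up. The VK_ constants are the standard Windows virtual-key codes.

```python
# Standard Windows virtual-key codes.
VK_TAB = 0x09
VK_MENU = 0x12    # generic Alt
VK_LMENU = 0xA4   # left Alt
VK_RMENU = 0xA5   # right Alt

ALT_KEYS = {VK_MENU, VK_LMENU, VK_RMENU}

class AltTabDetector:
    """Tracks which Alt keys are held and reports Tab presses made while Alt is down."""

    def __init__(self):
        self.alts_held = set()

    def on_key(self, vk_code, is_down):
        """Feed one key event; return True if it is Tab pressed with Alt held."""
        if vk_code in ALT_KEYS:
            if is_down:
                self.alts_held.add(vk_code)
            else:
                self.alts_held.discard(vk_code)
            return False
        return vk_code == VK_TAB and is_down and bool(self.alts_held)

detector = AltTabDetector()
events = [
    (VK_LMENU, True),   # Alt down
    (VK_TAB, True),     # Tab down while Alt is held: this is Alt+Tab
    (VK_TAB, False),    # Tab up
    (VK_LMENU, False),  # Alt up
    (VK_TAB, True),     # plain Tab, no Alt held
]
results = [detector.on_key(vk, down) for vk, down in events]
```

In a real switcher, the event that returns True is where you would show your own UI and swallow the keystroke so the default Alt+Tab handling doesn't also fire.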

#2: Enumerate windows

A switcher needs to know which applications it's switching between. There are a few ways to do this, and it's not particularly complicated. Two options are EnumWindows and the UI Accessibility framework (see System.Windows.Automation). The latter has more functionality but can be less performant if you're not careful.

As a side note, you can also use the UI Accessibility framework to determine the order of buttons in the Windows task bar, which might be useful depending on what behavior you want in your switcher. I did this by finding the Explorer process and searching its windows for one with a ClassNameProperty of "MSTaskListWClass". The child windows of this window seem to be the taskbar buttons. There might be cleaner ways to do this, though, and no doubt this is subject to breakage in future releases of Windows.

#3: Actually switch apps

Once the user picks an app to switch to, your switcher needs to actually perform the switch. This is the part that gives people the most trouble, and the one I fought with the longest. I tried SetForegroundWindow (unreliable), SwitchToThisWindow (also unreliable), AttachThreadInput (unreliable and probably a bad idea), and no doubt some other things I already forgot about. These do not work due to deliberate (and good) security limitations in Windows.

What does work is to set uiAccess to true in your application manifest (see this MSDN overview). You do not need to change requestedExecutionLevel (the app does not need to run with elevation), you just need uiAccess="true". In order for uiAccess to work, you will need to sign your executable (self-signing is a good option for initial development) and install it in Program Files. If you fail to do either of those things, your app will not be able to launch.

Once you enable uiAccess, your switcher will be ready to go. I ended up settling on SwitchToThisWindow for the switching, but I think SetForegroundWindow would probably work as well. For debugging, usually I just temporarily change uiAccess to false and live with the fact that some switches will fail due to security limitations.

Conclusion

That's it! Figuring this out (particularly item 3) was a long and frustrating process, so hopefully this will save other developers some pain.

Sunday, August 18, 2013

a note about the SDLC

I've been reading a lot of resumes recently, as we slowly grow the software group where I work, and I've noticed a lot of folks referencing the Software Development Life Cycle (SDLC). A large percentage of these resumes indicate that the applicant is well versed in the "entire SDLC", and they often go on to list the SDLC in detail as something like the following:
Requirements-gathering, Design, Development, Testing, Deployment
My concern here is that almost nobody includes "maintenance" or "evolution" in their list.

I'll say up front that I am not an adherent to any specific formal development process, be it Agile, Waterfall, or any of the other approaches out there. I certainly don't claim to be an expert on the SDLC. I can say this with complete confidence, though: no matter what process you follow, software always requires maintenance!

When I read these resumes I get the impression that the person thinks software development is like making a sculpture: you decide what it should look like, do the sculpting, make sure it looks like you intended it to, deliver it to the customer, and you're done! This is emphatically not how software development works.

A much better way to think about software is like a garden. Once you plant it, your job has only begun. You get to spend the next N years tending the garden: pulling weeds (fixing bugs), occasionally adding new plants (new features), and periodically moving things around (refactoring). Whether or not you do this work, someone has to do it, or the garden/software turns into a jungle and eventually dies.

I won't say that I immediately discard any resume that lists the SDLC and leaves out maintenance, and I realize some folks just haven't encountered this part of the process yet, but seeing this on a resume is a huge red flag for me.

Thursday, July 18, 2013

why you should learn powershell

Are you already a user of Powershell? You should be, if you work with Windows at all. Microsoft describes Powershell thus:
Windows PowerShell® is a task-based command-line shell and scripting language designed especially for system administration.
That description is accurate but it really only scratches the surface. I would say Powershell is useful for anything from simple repetitive tasks to complex scripts or even full-on programs. For instance, my wife and I took a trip a few years ago, and we both took several hundred pictures with our respective cameras. I wanted to be able to put the pictures in an online album and have them arranged in proper chronological order. My pictures had names like P123456 and hers had names like CIMG1234; for both cameras the numeric part would increment with each picture. The problem is that standard alphabetic sorting would result in her pictures and mine staying separate. Enter Powershell: with one line of typing I was able to rename all the pictures so that an alphabetic sorting would properly show the pictures in chronological order.
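The original one-liner isn't reproduced here, so as a rough illustration of the idea, here is a sketch in Python rather than Powershell. It prefixes every file name with a sequence number, so that alphabetic order matches chronological order. It assumes each file's modification time reflects when the picture was taken (the timestamp stored in the photo itself would be more reliable), and the function name and the demo timestamps are made up.

```python
import os
import tempfile

def rename_chronologically(folder, width=4):
    """Prefix each file with a zero-padded sequence number, oldest file first."""
    names = [n for n in os.listdir(folder)
             if os.path.isfile(os.path.join(folder, n))]
    # Sort by modification time, using the name to break ties.
    names.sort(key=lambda n: (os.path.getmtime(os.path.join(folder, n)), n))
    renamed = []
    for index, name in enumerate(names, start=1):
        new_name = f"{index:0{width}d}_{name}"
        os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
        renamed.append(new_name)
    return renamed

# Demo with empty stand-in files and made-up timestamps.
with tempfile.TemporaryDirectory() as folder:
    shots = [(1000, "P123456.jpg"), (2000, "CIMG1234.jpg"), (3000, "P123457.jpg")]
    for taken_at, name in shots:
        path = os.path.join(folder, name)
        open(path, "w").close()
        os.utime(path, (taken_at, taken_at))

    renamed = rename_chronologically(folder)
    listing = sorted(os.listdir(folder))  # alphabetic order is now chronological
```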

That's probably not enough to sell you on the idea, so here's a slightly more structured pitch:

It's great for customer support
Whether your customers are the normal kind (the ones that pay you in dollars) or the informal kind (the family members that pay you in cookies), Powershell is great for customer support. All recent versions of Windows come with it already installed, so if you know how to use it, you have a ready-made tool for solving problems. No extra work required. This can be a life saver if you don't have access to your regular geek toolkit.

You don't have to be a programmer
Powershell was designed to be used by non-programmers. You don't have to understand all the rules of C# or Java, you just have to understand a few simple concepts and be willing to experiment. You can also find tons of recipes online and in print for solving different problems. You'll find you can do amazing things with very little typing.

It will make you smarter
If you already know how to program, Powershell will help you learn to think in different ways, since working with the object pipeline is comparable to functional programming, whereas most programmers do imperative programming. Or you can stick to what you know and write it like it's C. Or you can mix and match!

It makes a great bridge to "serious" code
Since Powershell is built on the .NET platform, it easily connects to existing C# components. This means that you can do anything that the huge .NET Base Class Library can do. It also makes a great tool for automating large applications, especially if you take the time to build proper cmdlets.

There you have it - four good reasons to learn Powershell. What are you waiting for?

Sunday, May 12, 2013

fractals

I'm reading The Most Human Human and ran across the following:
I tend to think about large projects and companies not as pyramidal/hierarchical, per se, so much as fractal. The level of decision making and artistry should be the same at every level of scale.
I really like this way of looking at things, and it matches my personal experience very well. I can easily imagine our company this way. At large scale you can see big departments and projects and the decisions that guide those. Zoom in to the level of one department or project and you find more interactions and decisions, which have smaller significance, but which still take time and creativity to perform well. Zoom in even further and you find individuals, who are applying themselves to small portions of a problem space, making decisions and looking for elegant solutions to whatever bit of the problem they're working on. Every level matters.

I'll leave you with the following lyrics from Mandelbrot Set by Jonathan Coulton. The song really has nothing to do with the above (other than being about fractals), but it's fun.
Mandelbrot Set, you're a Rorschach Test on fire
You're a day-glo pterodactyl
You're a heart-shaped box of springs and wire
You're one badass fucking fractal
And you're just in time to save the day
Sweeping all our fears away
You can change the world in a tiny way