How to write cross-platform code (backblaze.com)
109 points by pmarin on Oct 22, 2010 | hide | past | favorite | 40 comments


I would say building for all platforms to verify you didn't break anything is the perfect job for a continuous build and integration server.

Simply don't accept changes that don't build cleanly and pass the test suite on all platforms.

Why bother the programmer to remember to do this?


By this I mean that once a new feature is implemented and tested on one platform (let’s say Windows), the programmer checks it into Subversion, then IMMEDIATELY checks it out on both Macintosh and Linux and compiles it.

Yuck. Here's how we do it:

  $ git commit
  $ git push origin HEAD:refs/for/master
The commit shows up in gerrit for code review. Immediately the buildbots (one for each platform) notice the change and attempt to compile it. The results are reported back to gerrit and a failed compile blocks the change from being submitted.

Assuming the change compiles cleanly and passes code review, it can then be merged to the master branch. Importantly, broken code doesn't get checked in in the first place.

Now, we happen to use git and gerrit and buildbot, but any decent system should have code review and a build farm that allow changes to be tested and reviewed before they are checked in.

If your developers aren't following some procedure (e.g., test compile on every platform), consider it may be because there is too much friction. Instead of firing the developers, ask why they aren't following it. Maybe other developers are impacted too and just have a higher tolerance for menial tasks, but those tasks are still costing the organization.

http://code.google.com/p/gerrit/

http://buildbot.net/trac

http://github.com/jaysoffian/buildbot/commit/e4c0d458374b9a0...

And if you happen to use hudson:

http://wiki.hudson-ci.org/display/HUDSON/Gerrit+Plugin

http://wiki.hudson-ci.org/display/HUDSON/Gerrit+Trigger


Agreed. Using git or mercurial would get around this. I worked for nearly a decade on cross-platform code that followed a similar methodology.

Our rule was the code must compile w/o warnings on 3 dissimilar platforms before you could commit it. We were able to do this easily because we had network-mounted home directories, so when you logged into another machine, the development tree was automatically mounted. This made it easy to compile everywhere without checking in, then checking out.


In our case we have a buildbot attached to a Subversion repository, along with a try server.

You test your changes on all the various platforms by pushing them to the try server; it compiles them on the platforms we have set up at the moment (FreeBSD, Mac OS X, Windows, Ubuntu Linux) and gives you a status report on what, if anything, broke. If nothing is broken you can push the change out to Review Board, at which point someone reviews the code.

Once the code has been reviewed and "ship it" has been tagged on RB, the code is checked into Subversion proper. At that point the buildbot picks up the change and does a quick (svn update) build to make sure nothing was broken, then does a full (rm -rf builddir; svn co) build at the top of the hour.

Reports get mailed, code is checked in.


The thing that I most favor in cross-platform programming is the code quality. Heck, I might almost say that even if you only intend to write a program for Linux, make sure it compiles and works on Windows and OS X too. (Well, almost.)

Somehow, almost magically, if you have two or preferably three platforms the relevant code abstracts itself nicely out of what is the low-end mess composed of platform specific concepts. I don't mean merely thinking in terms of FooHandles and BarStreams instead of "file descriptors" but the whole process of thinking that happens when you have to separate, in your brain, that what is "platform level code" and what is "problem level code".


It's worth pointing out that even 15 years after the Netscape IPO, it's not possible to write cross-platform networking code for Windows and Unix using C or C++ without using a 3rd party library!

On top of basic socket support, you then have to add support for the common protocols (HTTP, SMTP etc.). I think this alone explains the popularity of networking "batteries included" dynamic languages like Python, and Java's status as the de-facto enterprise standard for network programming. Standardized languages like C and C++ move too slowly for the internet world.


Those are all very sensible rules. You could give everyone an OS X desktop with Windows and Linux in VirtualBox, using a shared folder; then you could do the test build locally without having to bounce it off version control in the middle.


I had a job with exactly the dev environment you're describing (plus an HPUX box and a Solaris box), and it worked quite well. Sadly we were stuck using Visual Studio 6 in Parallels for development, but otherwise it's not a bad idea.


Or use buildbot with a tryserver. Have it handle the task of compiling on the various different platforms and giving you back errors.


In my work, I end up writing code that compiles on a lot more platforms than that. These are my thoughts, I suspect I work at a much lower level than the author, so the disagreements I have are likely just because of that:

1) Don't use C++. Not all platforms have a C++ compiler.

2) Get your build system right. Don't let it get in your way.

3) Avoid #ifdefs. These don't scale particularly well as you increase the number of platforms. In the example in the article with the file size function, there would be a completely separate file with this function in it for each different version. The build system will compile the correct one.

4) Use plain boring ANSI C.

5) Developers don't have to compile on every platform (doesn't scale when you support bazillions of platforms and toolchains). If something is broken on your platform you fix it. If someone breaks something, you politely let them know what the problem was.

6) Avoid undefined behaviour. This is harder than you might think. Different compilers will be able to do different optimisations, and they will take advantage of different bits of undefined behaviour. Classic problems include overflowing a signed integer, right shifts on negative numbers, modulo on a negative number, type punning for endian checks, etc.

7) I agree with the point about standard 'C' types. However, there are a few extra bits. First, know what the types actually mean (char >= 8-bit, short/int >= 16-bit, long >= 32-bit). Don't make any assumptions that aren't in the C spec. You might then want to put typedefs on top of these types to give them a bit more meaning.

8) Don't assume there is a FPU. Be prepared to do fractional maths in fixed point. There will always be float/double types, but they will be so dog slow on some platforms, and the memory cost of bringing in the library for them will be bad.

9) I disagree with UTF-8 for all APIs. Define a string abstraction, so on platforms (or rather, "on windows") you can have UTF-16 strings passed in and operated on without a conversion, and everywhere else you can use UTF-8. Avoid paying the conversion penalty on every operation.

10) Think about memory management. When memory is allocated, the malloc-like thing that is doing it should know what the memory is actually for. On some platforms, you will want to put different allocations into different spots in memory (this is also a nice place for another abstraction, since on some machines there won't be any different types of memory).

11) Don't avoid undefined behaviour by writing a single wrapper function (e.g. "safe signed integer add") and using it everywhere. The behaviour is typically undefined for a good reason (since different platforms will natively want to do it different ways). In the integer add example, you might want to assert that it never overflows, wrap around on overflow, truncate on overflow, return an error if it overflows, etc. Each of these things will have their spot, so don't bunch them all together.

12) Code that might want to be implemented in assembly should live in small .c files. Then when you go to write it, you can just do it bit by bit (and again, use the build system to tell it how to build it, don't hack it with the preprocessor).
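To make point 3 concrete, here is a hedged sketch (function and file names are illustrative, not from the article): one declaration in a shared header, one implementation file per platform, and the build system compiles exactly one of them per target, so no #ifdef appears in the code itself. The POSIX version is shown.

```c
/* filesize.h would declare:  int64_t my_file_size(const char *path);
 * The build system compiles exactly one of filesize_posix.c /
 * filesize_win32.c per target; neither file needs an #ifdef.
 * This is the POSIX implementation (illustrative names). */
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

int64_t my_file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;                 /* e.g. file does not exist */
    return (int64_t)st.st_size;
}
```

The win32 file would implement the same prototype on top of GetFileSizeEx; callers never know which one they got.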


>> First, know what the types actually mean (char >= 8-bit, short/int >= 16-bit, long >= 32-bit) ... You might then want to put typedefs on top of these types to give them a bit more meaning.

Even better, use stdint.h:

uint8_t, int16_t, etc. Completely unambiguous to anyone.


Yes, however to get these, you are now in C99 land, which excludes you from some compilers (the obvious one is the MS compiler). What we do is define our own compiler abstraction which can use stdint.h on C99 compilers, and typedef standard types into C99 versions for other compilers.
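A hedged sketch of such an abstraction header (the my_* names and the MSVC version check are illustrative): use <stdint.h> on C99 compilers, and typedef native types into the same names elsewhere.

```c
/* Illustrative compiler-abstraction header: <stdint.h> where available,
 * hand-rolled typedefs on compilers that lack it. */
#if defined(_MSC_VER) && _MSC_VER < 1600   /* MSVC before VS2010: no stdint.h */
typedef signed char      my_int8;
typedef unsigned char    my_uint8;
typedef short            my_int16;
typedef unsigned short   my_uint16;
typedef int              my_int32;
typedef unsigned int     my_uint32;
typedef __int64          my_int64;
typedef unsigned __int64 my_uint64;
#else
#include <stdint.h>
typedef int8_t   my_int8;
typedef uint8_t  my_uint8;
typedef int16_t  my_int16;
typedef uint16_t my_uint16;
typedef int32_t  my_int32;
typedef uint32_t my_uint32;
typedef int64_t  my_int64;
typedef uint64_t my_uint64;
#endif
```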

So, once you have a working stdint-like thing you then usually want the "least" (or maybe "fast") variants of them instead:

uint_least8_t, int_least16_t etc.

These have the advantage of being guaranteed to exist (the exact-size variants are only required to exist if the compiler actually has types that are exactly 8, 16, 32 or 64 bits). This means that if you are running on, say, a 24-bit chip (char==short==int==24-bit, long==48-bit) your code will still work, where "not working" could mean either failing to compile because the exact-width types are missing, or compiling but running really slowly because unnaturally sized types have to be emulated.

On the other hand, the type int_least8_t must exist (as must the 16-, 32- and 64-bit variants). For the 64-bit type you have to be careful, though: C90 doesn't require a 64-bit type, so a fallback abstraction usually needs compiler-specific tricks ("long long" is a safe bet).

Of course, if you actually want a type that is exactly 16 bits wide, then use the exact-width types, as that is what they are for... but usually you don't care if it is bigger.

And this is why the normal char/short/int/long types are good: They don't over-specify.
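A small sketch of the difference (assuming a C99 <stdint.h>): the least-width type is guaranteed to exist but may be wider than 16 bits, so any 16-bit wrap-around has to be applied explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* uint_least16_t always exists; uint16_t exists only when the platform
 * has an exact 16-bit type. Because uint_least16_t may be wider, the
 * 16-bit wrap-around is done explicitly with a mask. */
uint_least16_t sum16(const unsigned char *p, size_t n)
{
    uint_least16_t sum = 0;
    while (n--)
        sum = (uint_least16_t)((sum + *p++) & 0xFFFFu);
    return sum;
}
```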


what sort of programs are you working on that must run on platforms that lack a C++ compiler and need to be portable? (I am legitimately curious)


Tiny embedded things from people you've never heard of.

Often, I won't even get to see the compiler for the platform, I'll just write the code, send it to a completely external person, and say "this should compile and run on your platform, I can help if it doesn't".

The core of what I do is audio DSP, so it has to be memory/cycle efficient too, otherwise no one wants to use it.

But as soon as you start developing for these platforms, you then need to build testing tools which also work there, and all the other fun stuff that goes along with it, so it ends up being a lot more than just the core DSP stuff (which is where most of the platform specific stuff comes for me actually, as the core DSP stuff is just memory in memory out type stuff with no side effects).



Probably embedded - many more constraints than desktop environments. And their C++ support is usually problematic.


Yes. But I suppose the thing about embedded stuff is that if you write good "embedded" code (whatever that means), then it works well everywhere. If you write good "desktop" code, then it probably won't work well on an embedded system. So, when I'm doing my own hobby projects which are never going to go onto hardware, I still write it as if it might end up on a whacky chip one day, because when you do it this way, C seems to smile on you, and have exactly all the tools needed to express what you mean without saying more than you mean.


-Avoid #ifdefs.

Agreed. If you need them, put the ifdef'd bits in a utility library. E.g., instead of ifdef'ing your code for pthreads and Windows threads, build a simple thread library that abstracts the two behind a common API.
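A minimal sketch of that idea (the my_thread_* names are made up): one header declares the API, and the build system compiles either thread_posix.c or thread_win32.c. The POSIX side might look like:

```c
#include <pthread.h>
#include <stdint.h>

/* thread.h would declare this API; thread_win32.c would implement the
 * same functions on top of CreateThread/WaitForSingleObject. */
typedef struct { pthread_t handle; } my_thread_t;

int my_thread_create(my_thread_t *t, void *(*fn)(void *), void *arg)
{
    return pthread_create(&t->handle, NULL, fn, arg);
}

int my_thread_join(my_thread_t *t, void **result)
{
    return pthread_join(t->handle, result);
}

/* Example worker: returns 42 through the void* result channel. */
static void *worker(void *arg)
{
    (void)arg;
    return (void *)(intptr_t)42;
}
```

Callers include only thread.h, so no platform ifdefs leak into application code.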


> Classic problems include overflowing a signed integer, right shifts on negative numbers, modulo on a negative number, type punning for endian checks, etc.

It's not just right shifts of negative numbers. It's also all shifts by the size of the input or larger.

Yes, when 4 == sizeof(int), shifting an int by 32 may have the same value as shifting by 0. Shifting by 33 may have the same value as shifting by 1. And no, this isn't a signed/unsigned problem.
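A hedged sketch of one way to pin the behaviour down, if wrap-to-zero is what the caller actually wants:

```c
#include <stdint.h>

/* v << n is undefined when n >= 32 for a 32-bit type, even unsigned.
 * This wrapper picks one defined result (zero) for out-of-range counts;
 * other call sites may instead want an assert, per point 11 above. */
static uint32_t shl32(uint32_t v, unsigned n)
{
    return (n >= 32) ? 0 : (v << n);
}
```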


Yes, that is another classic one. And this one I think will hit people even when they aren't doing embedded stuff. I seem to recall a compiler optimisation which would make the assumption that the shift count would be well defined, and so it could then do some other optimisation (e.g. maybe that shift count was then used in a loop later, you can then assume that the loop runs less than 31 times)

Also in your example, you should have said sizeof(int)*CHAR_BIT == 32 :)


They aren't entirely bad suggestions, but they can be improved somewhat, depending on taste:

>Rule #4: Use only built in #ifdef compiler flags, do not invent your own

Actually you can separate the source code into src/linux src/w32 src/mac -- but that is just my preference.

>Rule #9: Require all programmers to compile on all platforms

No no no -- a build has to take no more than a make command. Anything else is just doomed to failure.


> Actually you can separate the source code into src/linux src/w32 src/mac -- but that is just my preference.

IMO the author suggests developing in a single source tree which compiles on every platform for very good reasons, in particular to minimize duplication and to be able to update a single method in a single file to work on all platforms. Your suggested multi-directory approach requires duplicating most OS-specific methods and classes. Depending on the situation one or the other approach might be better, but I think the author has a point in many of them.

> No no no -- a build has to take no more than a make command. Anything else is just doomed to failure.

Building with just a "make" should be fine on Linux, but for anyone who downloads the source code to a Windows/VS or OSX/XCode machine, it's great being able to just open the project in the standard editor and compile it.


Our 'make' vs Visual Studio issue: in Visual Studio you add a new module to the project explicitly, whereas many makefile setups pick up new source files automatically. This causes a common bug: a _WIN32-only module gets built on the wrong platform because the developer forgets to put the module in the makefile exclusion list.

Small matter, easily detected and fixed in the first build, but would be nice to automate somehow.


CMake (or perhaps SCons, but I have no experience with that) might be useful for you. Although going through the motions of writing build scripts for your build system might not be something you want to endure.

In any case, I used CMake in one shop I worked in and it was nice to have the build files all managed in one place. We were able to manage builds for Windows, Linux, Mac, and Solaris.


Oh I didn't mean that you should put all your source code in those folders - only that which is different for each platform, and if that is only one or two methods in each class then that is not a problem as C++ is much more flexible in that regard than C# or Java.


An #ifdef _WIN32 is a platform ifdef. Especially to cross-compile Unix flavors, it can help to use capability ifdefs, such as #ifdef HAVE_VSNPRINTF. This approach replaces if-then-else logic on platforms in the source with tests in the configure (or CMake) script.
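A sketch of the capability style. HAVE_VSNPRINTF would normally be probed and defined by the configure or CMake script; it is defined inline here only so the fragment stands alone.

```c
#include <stdarg.h>
#include <stdio.h>

/* Normally configure/CMake probes the libc and defines this.
 * Defined here so the sketch is self-contained. */
#define HAVE_VSNPRINTF 1

int my_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;
    va_start(ap, fmt);
#ifdef HAVE_VSNPRINTF
    n = vsnprintf(buf, size, fmt, ap);   /* the capability is present */
#else
    n = vsprintf(buf, fmt, ap);          /* pre-C99 fallback: no bound check */
    (void)size;
#endif
    va_end(ap);
    return n;
}
```

The source tests what the platform can do, not which platform it is, so a new Unix flavor usually works with no source changes at all.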

Nice article. Is the GoF Abstract Factory pattern not popular for #5, "developing cross-platform base libraries?"


Care to elaborate on why you chose C/C++ instead of Java for the backup client?


He doesn't have big enough patent warchest?


My knowledge of Java is several years out of date, but I know that Java couldn't detect symbolic links at one stage. For a backup application, this sort of thing is useful. Similarly, think of all the other whacky platform-specific file things that probably aren't in the Java view of the world (resource forks on OS X, alternate data streams on NTFS, device files, named pipes, sockets, etc.). I can imagine that writing a backup application that can't see past the abstraction chosen by Java would cause a lot of trouble.


Rule #7


As a console/PC video game programmer I can just say:

Make it compilable on your target platforms: PC, Wii, PS3, Xbox, whatever else is needed.

Don't care about other platforms, until they arrive.

Instead of #ifdef, try to find the pattern in your code, and use selector macros such as:

PC_WIIXBOXPS3(pcValue,consoleValue) or WII_XBOXPS3_PC(wiiValue,xboxOrPs3value,pcValue)
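A sketch of how such a selector macro might be defined (the PLATFORM_* symbols are assumptions; a real build would define exactly one of them on the compiler command line):

```c
/* In a real build, exactly one of these comes from the toolchain;
 * defined here only so the sketch compiles on its own. */
#define PLATFORM_PC 1

#if defined(PLATFORM_PC)
#  define PC_WIIXBOXPS3(pcValue, consoleValue) (pcValue)
#elif defined(PLATFORM_WII) || defined(PLATFORM_XBOX) || defined(PLATFORM_PS3)
#  define PC_WIIXBOXPS3(pcValue, consoleValue) (consoleValue)
#else
#  error "no platform selected"
#endif

/* Consoles get a smaller mixer; the PC build gets the large one. */
enum { AUDIO_VOICES = PC_WIIXBOXPS3(64, 24) };
```

Each call site then reads as a table of per-platform values instead of an #ifdef ladder.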

Wrap align, thread-local, etc. as macro defines.

The UNICODE advice is wrong. These systems don't even have working locales, so don't trust strupr(), strlwr(), toupper(), etc. to work correctly for your localized game.


Check out LiveCode (formerly known as Runtime Revolution), it is a cross platform language + IDE. It is a modern descendant of Apple HyperCard that is able to deploy Desktop Apps for Macs, Linux, Windows. Has an engine like PHP for Web Application development and soon will be available for iOS.

I've been using it for the past six years to do most of my development and am quite happy. I am able to develop on my macs and deploy anywhere.


sorry forgot the url for LiveCode http://www.runrev.com


Lots of good advice, but watch out for Unicode. IIRC, you can have illegal surrogate pairs in NTFS filenames (which are a kind of naive UTF-16), which renders them unconvertible to UTF-8. Certainly an edge case, but that's where bugs come from...

(BTW, what really shocked me about this was that it means some file paths can't be turned into file: URLs! I had assumed you could always do that.)


>Rule #9: Require all programmers to compile on all platforms

... and have a build server do this automatically & email the results. Manually building for each platform if there are more than two seems like quite a hassle and you know programmers will skip over it for minor changes.


The hardest part of cross-platform stuff is the GUI, and the solution given here is "don't make it cross-platform."

Which is really the right answer. But it makes the actual cross-platform stuff trivial in many apps.


this doesn't sound realistic


It is realistic. I worked on a cross-platform product for nearly a decade that followed a methodology very similar to this. Our code worked on 16-bit, 32-bit, 64-bit, big-endian and little-endian systems. Following these rules did not slow down development noticeably. In fact it became second nature.


The down-votes on parent are misplaced. Why not enlighten with comments about how it -is- realistic?


Because he/she doesn't explain why it is not realistic, his/her comment doesn't add any value to the conversation.



