I'm sitting in the New York Café in Budapest, Hungary. It recently got redone after being in a sorry state since 1956, when it had a run-in with a Russian tank. It happens to be just two blocks from the college where my dad used to teach, so I remember walking by it when I was a little kid. Back then, it looked unceremonious. Now, with all the shiny gold plating everywhere, this place looks opulent and luxurious, if a bit over-the-top. A smart girl once remarked that I have a tendency toward big-picture philosophizing when surrounded by luxury, so here we go.
In 1995, Netscape offered the vision that one day, everything would run inside the browser. The operating system would be just another layer of computer architecture. The golden copy of your files would reside on central servers. Your word processor, spreadsheet application, and e-mail client would all be displayed inside a window with a meteor shower animation in the top-right corner.
Well, that didn't happen. Microsoft bulldozed Netscape, and blue e's and rotating globes are still everywhere. But the vision is still alive.
Moving applications and data to the network makes a lot of sense. For storage, Google and Amazon can offer higher reliability at lower cost than home-brewed solutions. In fact, consumers often completely forget to back up their data. For applications, Firefox is available for all popular operating systems, and could easily become a popular application platform with some massaging. Imagine being able to log on from any computer in the world and having your data and favorite applications instantly available, just like you have them at home.
We're not there yet. The Web 2.0 boom introduced many applications that were formerly desktop-only territory: Google Calendar, GMail Chats, Kiko, Writely, Pixoh, and DabbleDB come to mind. But imagine spending an entire workday using only your browser, without any desktop applications. For anyone who uses more than just e-mail, that day would be a very unproductive one.
I have two pet problems that stand in the way. The first is engineering, not science; the second is science, not engineering:
- Offline Access
- Encrypted, searchable storage
Problem 1: Offline Access
We won't be able to cover the entire planet with always-on Internet access. We need a simple, standardized way to run web-based applications offline.
Earlier this year in Mountain View, I saw huge boxes on lampposts, designed to cover the area with WiFi. This is a great idea for populated areas, but don't extrapolate too far: Even if these boxes were a lot smaller and had much higher range, even if UMTS became free and universally available, even if Connexion receivers were reduced to the size of cell phones – even then, we wouldn't be able to cover the entire planet with Internet access. There will always be a long, dark train tunnel that will spoil the master plan.  On top of that, I doubt all this infrastructure could be provided for free, and my tolerance for paying $6/hour for spotty T-Mobile Hotspot reception is low.
An online application becomes completely useless the moment you lose connection. That's why we need to find a way to make online applications work offline.
This isn't the first time I've complained about this. In a previous post titled "How do we solve the offline problem?", I described the technical challenges and options.
Today, it seems to me that the most natural way to do this is via a mechanism similar to extensions in Firefox. First, take a browser you can run on any platform, then add a mechanism to easily create applications that do three things:
- Caching: keep a copy of your online data locally.
- Presentation: display the UI in the browser, either by faking it or by actually running an application server locally.
- Synchronization: reconcile online and offline data.
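To make the three responsibilities above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `FakeServer` stands in for the remote web application, `OfflineStore` is a hypothetical local layer, and conflicts are ignored entirely (queued writes are simply replayed in order, so the last writer wins). A real browser-side mechanism would look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class FakeServer:
    """Hypothetical stand-in for the remote web application."""
    records: dict = field(default_factory=dict)

    def fetch_all(self) -> dict:
        # Return a copy so the client cache and server state stay separate.
        return dict(self.records)

    def push(self, key, value) -> None:
        self.records[key] = value


@dataclass
class OfflineStore:
    """Hypothetical local layer: a cache plus a queue of offline writes."""
    server: FakeServer
    online: bool = True
    cache: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def refresh(self) -> None:
        # Caching: keep a local copy of the online data.
        if self.online:
            self.cache = self.server.fetch_all()

    def read(self, key):
        # Presentation reads only from the cache, so it works
        # with or without a connection.
        return self.cache.get(key)

    def write(self, key, value) -> None:
        self.cache[key] = value
        if self.online:
            self.server.push(key, value)
        else:
            # No connection: remember the write for later.
            self.pending.append((key, value))

    def sync(self) -> int:
        # Synchronizing: replay queued writes in order, then refresh.
        # Returns the number of writes sent (0 while still offline).
        if not self.online:
            return 0
        sent = len(self.pending)
        for key, value in self.pending:
            self.server.push(key, value)
        self.pending.clear()
        self.refresh()
        return sent
```

The hard part this sketch skips is exactly the one that makes synchronization difficult in practice: deciding what to do when the same record changed both locally and on the server while you were in the tunnel.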
Creating offline-enabled web applications will take a lot of work. But without an offline option, web apps will never overtake Microsoft Office.
Problem 2: Encrypted, Searchable Storage
We need to devise a scheme where we can store encrypted data remotely, with the ability to quickly and efficiently search it.
Search is a killer feature. The power you gain from being able to search your own data as quickly as you can search the web is immense. The more data you have, the more useful search becomes.
Recently, there has been a proliferation of new online storage providers, and there are rumors that even Google wants to get in the game pretty soon. For a list of current players, check out this comparison chart. Some of them, like Omnidrive, offer encrypted files, but some aren't even truly secure from a cryptographic perspective: XOR-ing data with the user password doesn't really help.
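Why XOR-ing with the password doesn't help can be shown in a few lines. The sketch below assumes the simplest variant, where the password is repeated over the data; the function names, the password, and the sample document are all made up for illustration, and no particular provider is claimed to work exactly this way. The point is that anyone who knows or can guess a few bytes of the plaintext, such as a standard file header, gets the password back directly.

```python
from itertools import cycle


def xor_with_password(data: bytes, password: bytes) -> bytes:
    """'Encrypt' (or decrypt) by XOR-ing each byte with the repeating password."""
    return bytes(b ^ k for b, k in zip(data, cycle(password)))


def recover_keystream(ciphertext: bytes, known_prefix: bytes) -> bytes:
    """XOR the ciphertext with known plaintext to reveal the key bytes used."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_prefix))


# Hypothetical example: a PDF always starts with a predictable header.
password = b"hunter2"
document = b"%PDF-1.4\nsecret contents"
ciphertext = xor_with_password(document, password)

# The attacker sees only the ciphertext and guesses the file header.
leaked = recover_keystream(ciphertext, b"%PDF-1.4\n")
```

Here `leaked` holds the password bytes repeated over the length of the guessed header, which is enough to decrypt the rest of the file.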
The key point is this: Many users won't completely trust their storage provider, and won't store the golden copy of their files online, unless they're really, positively sure it's encrypted, and no one else can read it. As an extra benefit of encryption, the storage provider won't even be able to hand data over to the DOJ for their 'statistical evaluations about children accessing pornography.'
That's why encryption should be default. It should take place on the client side, and storage providers should never even see user data in plaintext. 
Storing all files in an encrypted manner has a huge drawback. The storage provider won't be able to index and search them anymore. Unless, of course, you found a way to encrypt data but still be able to search it, without losing security. And that's exactly what we need.
Obviously, I'm not the first person to think of this problem, and there's plenty of research on the topic. For example, there's this paper by Song et al. titled "Practical Techniques for Searches on Encrypted Data". You can safely skip to Section 5.4, where they discuss building indexes. Their solution is relatively simple, but it requires two round-trips to the server, and the storage provider is still able to learn some information about the documents from the user's access patterns. But that seems tolerable.
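To give a feel for what an index over encrypted data can look like, here is a deliberately simplified sketch. It is not the scheme from the Song et al. paper: it swaps in a plain keyed-hash (HMAC-SHA256) token per keyword, which leaks more than their construction, since the server can see which documents share a keyword and which tokens are searched for. All function names and the `index_key` parameter are hypothetical, and encrypting the document bodies themselves is left out.

```python
import hashlib
import hmac


def keyword_token(index_key: bytes, word: str) -> str:
    """Client side: derive an opaque, deterministic token for a keyword."""
    normalized = word.lower().encode("utf-8")
    return hmac.new(index_key, normalized, hashlib.sha256).hexdigest()


def build_index(index_key: bytes, documents: dict) -> dict:
    """Client side: map keyword tokens to the IDs of documents containing them.

    `documents` maps a document ID to its plaintext. Only the resulting
    index (tokens and IDs) would be uploaded, never the plaintext.
    """
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.split()):
            token = keyword_token(index_key, word)
            index.setdefault(token, set()).add(doc_id)
    return index


def server_search(index: dict, token: str) -> set:
    """Server side: look up a token without ever learning the keyword."""
    return index.get(token, set())
```

To search, the client computes `keyword_token(index_key, "tax")` locally and sends only the token; the server returns the matching document IDs, which the client then downloads and decrypts. Without `index_key`, the server cannot turn a token back into a word or produce tokens for words of its own choosing.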
Two side notes: Any client software for accessing encrypted storage would need to be open source, at least in the core parts. With a closed-source client, how would your users know you're not quietly sending along their encryption key? Also, while it looks like I'm talking exclusively about online storage, this also applies to all data stored in a web application. Wouldn't it be great if Google Calendar didn't know the plaintext of your appointments but sent you an encrypted record which is then decrypted and rendered in your browser?
My opinion is that encryption should be standard in any kind of online storage solution. Without search, however, online storage is useless.
Conclusions
My speculation is that the current crop of web developers will at first resist solving problem 1, because they're too much in love with their server. Also, someone needs to come up with a good example solution that everyone else can copy - much like GMail and Google Maps first came up with neat uses of AJAX. This may be very hard, as it may require hacking deep inside the browser.
As for online storage, I believe it is an important problem. But will users appreciate this functionality? Not before the media makes a huge story out of teenagers hacking into some celebrity's online picture collection, or Chinese students getting arrested at a dissident meeting they had entered in Yahoo Calendar. With some public awareness of the issue, I think people will flock to the provider offering encryption, and they'll be happy to see a search box.
Thanks to Markus Egli and Bálint Miklós for reviewing drafts of this.
 I guess only someone who lives in Switzerland would come up with a train tunnel as the primary example.
 Dear readers, if you have an idea about how this can be done with current Firefox extensions or other, existing technologies, let me know.
 A successful online storage solution needs far more than encryption, the most important aspect being extremely good desktop integration. Also, with any encrypted storage solution, we'd need to train the user to keep offline backups of his encryption key: Without the key, all his data is lost.