Farewell Etherpad

My new role?

TL;DR: I’m announcing my retirement from actively maintaining Etherpad.

The Etherpad project is great and will continue without me; I have no doubt of that.

4 years, 100+ plugins, ~30k lines of core code and ~1.5k core commits are, in my opinion, a decent contribution to open source and an open web.

Etherpad is the only editor that provides a truly portable, full-fidelity document.

Over the past few years several other fantastic collaborative editors have risen and fallen. Etherpad will continue to exist, and once it finds the right home and the ideal project lead it will flourish. On top of Etherpad, a lot of my time was dedicated to my other projects.

My plugins will all be moving over to the Etherpad Foundation GitHub organization on the 1st of September.

Special thanks to Automattic, Wikimedia, Mozilla, UN, Nato, Storytouch, OGF and other major individual contributors such as Peter, Rob, Egil and Aaron. I can’t thank you all enough for being part of the journey 🙂

Unsharable

Image shamefully stolen from Digital Trends..

Over the past 6 months I have been working mostly on projects I couldn’t share updates on. This has left my blog somewhat neglected, for which I apologize. I guess I can talk about the Visa Payment Ring a little now that the press has heard about it!

I haven’t been completely dormant, though: my GitHub activity has still been relatively fruitful, with more people using my software than ever. I’m still actively developing for the Open Gov Foundation (OGF), and I hope to be more active over the coming months as their software goes live in more districts. The goal of this software is to use Etherpad to provide a better collaborative editing tool for governments drafting laws, bylaws and so on. While Mozilla, the UN and Nato do this already, the OGF is extending Etherpad to provide a more structured approach to legislation, making the data more machine-consumable and therefore representable in multiple ways.

When it comes to open source, I guess I’m now more of a package maintainer than an active developer. My plugins mostly have active contributors who are contributing bugfixes and new features, which means my main task (code review and merges) can be done on the toilet. I see this as a fair to decent use of my time. I spend maybe an hour a day on it and often find myself triaging bugs in boring meetings.

Most of my hands-on project work has been around R&D, and because the NFC Ring had an IP leak and a competitor abused our IP to launch their product, a lot more has been happening behind closed doors. I feel like I shared as much as I could anyway and helped others build their own NFC Rings.

One thing I didn’t share on my blog was the awesome work we did creating the world’s first payment ring for Henry Holland and Visa Europe Collab (VEC); that was a lot of fun. I got to hang out with people from the fashion and payment worlds, which was a new experience for me and something I’m grateful for. If VEC or Visa ever reach out to you about a project, I strongly suggest getting involved because the teams are great, but watch out for the legal side: Visa’s legal department seems somewhat disconnected from their innovation teams, and this can lead to lengthy negotiations!

I’m still building new rings; that’s pretty much my full-time job now. With McLear Ltd going through changes and my life going through changes, writing new content here has been a low priority, and I’m not sure this will change any time soon. The very nature of the work I’m doing means I couldn’t share much even if I wanted to, and as I’m venturing into areas where others already write far more frequently than me, I feel it would not be a fruitful use of my time.

I wrote this before the Visa press release on the payment ring. Obviously, with that now live, you have more insight into why I have been busy! Vive la finger!

LaTeX in Etherpad


Inserting and collaborating on formulas in Etherpad is really easy: simply install the ep_mathjax plugin.

What are these things?

Etherpad is an open-source collaborative editor.
LaTeX is a high-quality typesetting system.

Getting started with LaTeX and Etherpad

To install the plugin, browse to http://youretherpadhere/admin/plugins and install the ep_mathjax plugin.

Once the plugin is installed, click the ∏ button and you can begin writing LaTeX markup. Submitting the form will insert the LaTeX representation of your text into the pad.
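
As an illustration of the kind of markup you might type into the form, here is the quadratic formula. Whether the plugin expects surrounding math delimiters is not covered in this post, so treat the bare expression below as an assumption rather than the plugin’s required input format.

```latex
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
```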

Multiple Etherpad instances on one host with Docker

These Docker CLI commands will bring up multiple containers / instances on one host.

sudo docker run -d -e MYSQL_ROOT_PASSWORD=password --name ep_mysql mysql
sudo docker run -d --link=ep_mysql:mysql -p 9001:9001 tvelocity/etherpad-lite

sudo docker run -d -e MYSQL_ROOT_PASSWORD=password --name ep_mysql2 mysql
sudo docker run -d --link=ep_mysql2:mysql -p 9002:9001 tvelocity/etherpad-lite

This will bring up two unique Etherpad instances on ports 9001 and 9002. Each instance will have its own MySQL database.
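
If you want more than two instances, the same pattern can be generated in a loop. The following is a minimal sketch, not a tested deployment script: it only prints the commands (a dry run) so you can inspect them first. The function name `print_instance_cmds`, the `BASE_PORT` variable and the `ep_mysqlN` naming scheme are my own assumptions for illustration; note that it names the first database `ep_mysql1` rather than `ep_mysql` as above.

```shell
#!/usr/bin/env bash
# Dry run: print the docker commands for N Etherpad instances instead of running them.
# BASE_PORT and the ep_mysqlN container names are assumptions for illustration.
BASE_PORT=9001

print_instance_cmds() {
  local count="$1" i name
  for ((i = 1; i <= count; i++)); do
    name="ep_mysql${i}"
    # One MySQL container per instance, then an Etherpad container linked to it.
    echo "sudo docker run -d -e MYSQL_ROOT_PASSWORD=password --name ${name} mysql"
    echo "sudo docker run -d --link=${name}:mysql -p $((BASE_PORT + i - 1)):9001 tvelocity/etherpad-lite"
  done
}

print_instance_cmds 3
```

Once the printed commands look right, piping the output to `sh` would execute them; each instance ends up on its own host port, counting up from 9001.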

Etherpad database tests

I was curious how much the database affects Etherpad performance, and with the new load test tool it was pretty easy for me to test. I expect Redis will be the most performant as it is tuned for key-value storage, I expect MySQL will follow second, and DirtyDB will come in somewhere behind.

The idea behind the test is to take a snapshot of usage, simulating a lot of clients and authors generating and consuming a lot of content until the server can no longer serve responses to requests in a timely fashion (100ms).

I tested the following databases:

DirtyDB 0.6.9, standard.
Redis 2.8.4, standard.
MySQL 14.14, MyISAM tables.

I used the following configs, commenting out the ones not in use for a given run:

"dbType" : "dirty",
"dbSettings" : {
  "filename" : "var/dirty.db"
}

"dbType" : "mysql",
"dbSettings" : {
  "user"    : "etherpad",
  "host"    : "localhost",
  "password": "test",
  "database": "etherpad"
}

"dbType" : "redis",
"dbSettings" : {
  "host"      : "localhost",
  "port"      : 6379,
  "database"  : 0
}

Important notes:
* I killed the Etherpad server process in between each run.
* I ran the same test 3 times and took the mean of the results.
* The server ran on one of my cores, the testing client on the other.
* Read to write operations are roughly 5:1, but Etherpad holds data in memory, so it’s plausible the database is hardly being touched at all.
* Node version is v0.12 – Etherpad sha is cc0eaba7e262ccc97aa0ce34b5e4d0f6eac4fd08

The command used to run the test:

etherpad-load-test

Every 5 seconds this test creates new users in a ratio of 4 lurkers to 1 author; only the authors contribute content. After 50 seconds, 10 authors and 40 lurkers will be on the pad, and all of these authors will be generating content. The test continues until responses are no longer processed within 100ms.
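
The ramp-up described above can be sketched as simple arithmetic. This is an illustration of the description only, not the load test tool’s actual code: the batch size of 1 author plus 4 lurkers per 5-second interval is inferred from the "10 authors and 40 lurkers after 50 seconds" figure, and the function and variable names are hypothetical. The results reported in this post show roughly three lurkers per author, so the tool’s real schedule may differ from this description.

```shell
#!/usr/bin/env bash
# Sketch of the described ramp-up: one batch every INTERVAL seconds.
# Batch sizes are an assumption inferred from the description, not read from the tool.
INTERVAL=5
AUTHORS_PER_BATCH=1
LURKERS_PER_BATCH=4

# Print "<authors> <lurkers>" expected on the pad after the given number of seconds.
population_at() {
  local seconds="$1"
  local batches=$((seconds / INTERVAL))
  echo "$((batches * AUTHORS_PER_BATCH)) $((batches * LURKERS_PER_BATCH))"
}

for t in 5 25 50; do
  read -r authors lurkers <<< "$(population_at "$t")"
  echo "t=${t}s authors=${authors} lurkers=${lurkers}"
done
```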

Results

Redis
Clients Connected: 304
Authors Connected: 76
Lurkers Connected: 228
Sent Append messages: 15337
Commits accepted by server: 15310
Commits sent from Server to Client: 33638

MySQL
Clients Connected: 320
Authors Connected: 80
Lurkers Connected: 240
Sent Append messages: 16607
Commits accepted by server: 16506
Commits sent from Server to Client: 32201

DirtyDB
Clients Connected: 308
Authors Connected: 77
Lurkers Connected: 231
Sent Append messages: 15426
Commits accepted by server: 15325
Commits sent from Server to Client: 34636

*NOTE: DirtyDB WILL have severely slow startup times after 1M rows. To simulate this I created 1M rows and ran the test again; the result is below.

First, how to make dirty even dirtier:

# get the current size
wc -l var/dirty.db

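# double the file until wc -l shows ~1M
# (needs bash and a non-empty file; sponge is part of moreutils)
while [ "$(wc -l < var/dirty.db)" -lt 1000000 ]; do
  cat var/dirty.db{,} | sponge var/dirty.db
done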

DirtyDB 1M Rows
Clients Connected: 308
Authors Connected: 77
Lurkers Connected: 231
Sent Append messages: 15108
Commits accepted by server: 15007
Commits sent from Server to Client: 33238

Summary

Well, I must say I did expect database choice to have an impact on editor performance, but it doesn’t look that way under the tests I performed. I’m going to go back to the drawing board to find a test that should help us decide once and for all which is the most performant database for Etherpad!

My thoughts for a new test are:

1) Remove all Lurkers, they are just putting load on the test app and it’s the test app that’s maxing out, not the server.

2) Send messages more frequently (See 1 but also go beyond normal human rates of contribution)

Update…

Since I did the original tests, I modified the load test tool to create authors only. Here are the results:

Redis Authors only
Clients Connected: 180
Authors Connected: 180
Sent Append messages: 20553
Commits accepted by server: 20452
Commits sent from Server to Client: 22866

MySQL Authors only
Clients Connected: 184
Authors Connected: 184
Sent Append messages: 23045
Commits accepted by server: 22944
Commits sent from Server to Client: 20369

DirtyDB Authors only
Clients Connected: 204
Authors Connected: 204
Sent Append messages: 28555
Commits accepted by server: 28454
Commits sent from Server to Client: 21565

DirtyDB 1M+ rows Authors only
Clients Connected: 202
Authors Connected: 202
Sent Append messages: 26319
Commits accepted by server: 26218
Commits sent from Server to Client: 18200
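
One quick sanity check on these figures is the gap between appends sent and commits accepted in each run. The sketch below just subtracts the numbers copied from the authors-only results above; the `unaccepted` function name and the run labels are hypothetical.

```shell
#!/usr/bin/env bash
# Gap between appends sent and commits accepted, per authors-only run.
# Figures are copied from the results in this post.
unaccepted() {
  local sent="$1" accepted="$2"
  echo $((sent - accepted))
}

while read -r label sent accepted; do
  echo "${label}: $(unaccepted "$sent" "$accepted") appends not accepted"
done <<'EOF'
redis-authors-only 20553 20452
mysql-authors-only 23045 22944
dirtydb-authors-only 28555 28454
dirtydb-1m-authors-only 26319 26218
EOF
```

The gap comes out at 101 for every authors-only run, which suggests it is not tied to the database in use.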

Conclusion #2

With these tests it seems that the database doesn’t affect performance under heavy editor usage. My only assumption is that once the database gets large enough, selects and puts will be slow enough to warrant a more complex database. DirtyDB has the obvious disadvantage of not being something you could take home to your mother.