Bits and pieces

by Stefano Fratini

common sense developer

Page 2


Evolving CRUD (part 2)

Serving content at scale without complexity

 Command Query Responsibility Segregation

Martin Fowler and Udi Dahan have described this approach at length in the past. It is obviously not a new concept, and what follows is only one possible implementation of it.

Command Query Responsibility Segregation (CQRS)

A fancy name that hides a simple concept: read traffic (GET) has different requirements from write traffic (POST/PUT/DELETE) and should possibly be handled by different systems.

This is far from a new concept: it is already in production with big players, but it has never been standardised.

This design splits read and write requests according to the following philosophy:

 Read requests (GET)

  • Redis-based, to sustain high throughput (100k tps)
  • Uses advanced data structures to model data and indexes
  • A proper noSQL schema design from the ground up

 Write requests (POST/PUT/DELETE)

  • Data ingestion happens via a reliable queue
  • Suggested queuing
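
The split can be sketched in a few lines of Java. This is only an illustration of the idea: an in-memory map stands in for the Redis read model and an ArrayDeque stands in for the reliable queue, and all names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Minimal CQRS sketch: writes are enqueued as commands, a projector
// drains the queue into a denormalised read model, and GETs only ever
// touch the read model.
public class CqrsSketch {

    // A write command (the shape of the payload is illustrative).
    record SaveArticle(String id, String body) {}

    // Write side: stand-in for the reliable queue.
    private final Queue<SaveArticle> commandQueue = new ArrayDeque<>();

    // Read side: stand-in for the Redis-backed view.
    private final Map<String, String> readModel = new HashMap<>();

    public void handleWrite(SaveArticle cmd) {
        commandQueue.add(cmd);    // POST/PUT/DELETE path: enqueue only
    }

    // A background consumer would normally drain the queue; inline here.
    public void project() {
        SaveArticle cmd;
        while ((cmd = commandQueue.poll()) != null) {
            readModel.put(cmd.id(), cmd.body());
        }
    }

    public String handleRead(String id) {
        return readModel.get(id); // GET path: read model only
    }
}
```

Note the consequence of the split: a write is not visible to readers until the projector has run, i.e. the read side is eventually consistent.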

Continue reading →


Evolving CRUD (part 1)

Serving content at scale limiting complexity

Scalability is a simple concept that proves difficult to achieve without introducing complexity.

 Create Read Update Delete

CRUD stands for Create Read Update Delete and identifies all the possible types of interaction with a specific resource hosted in any type of datastore.

It’s the simplest paradigm for database access and translates easily into the world of web applications: most RESTful APIs are in fact built using this paradigm.

Rightly so, as it’s intuitive and simple in itself.


Usually a web application exposing a CRUD interface externally translates each interaction into a similarly straightforward database interaction.

For example, with a RESTful API backed by SQL we could have:

CRUD operation   RESTful method               SQL statement
Create           POST /resources              INSERT
Read             GET /resources/:resourceId   SELECT
Update           PUT
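
The mapping can also be captured in code. A small sketch (the enum and field names are illustrative, not from any framework):

```java
// Standard CRUD -> REST -> SQL correspondence, as a plain Java enum.
enum CrudOp {
    CREATE("POST",   "INSERT"),
    READ  ("GET",    "SELECT"),
    UPDATE("PUT",    "UPDATE"),
    DELETE("DELETE", "DELETE");

    final String httpMethod;    // RESTful method
    final String sqlStatement;  // SQL statement it typically maps to

    CrudOp(String httpMethod, String sqlStatement) {
        this.httpMethod = httpMethod;
        this.sqlStatement = sqlStatement;
    }
}
```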

Continue reading →


RC4, an old friend

Recently Microsoft, Google and Mozilla announced they will drop support for RC4 in their browsers’ SSL implementations, but RC4 won’t disappear overnight from our everyday life despite its troubled history.

 A closed source algorithm

RC4 is an RSA stream cipher developed in 1987 as a closed source algorithm and never officially released to the public.

It came under intense scrutiny after it was leaked to the internet in 1994, and soon thereafter doubts about its robustness started circulating.

This didn’t stop Microsoft from making it the default encryption cipher for the RDP protocol in 2006, nor chip manufacturers from adopting it en masse for WEP from 1997 onwards.
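
The leaked algorithm itself fits in a few lines: a key-scheduling pass (KSA) followed by a keystream generator (PRGA). The sketch below is the published RC4, shown for illustration only; given the history above, don’t use it to protect anything.

```java
// Textbook RC4: a 256-byte state permutation, mixed by the key (KSA),
// then swapped one step per output byte to produce the keystream (PRGA).
public class Rc4 {
    private final int[] s = new int[256];
    private int i = 0, j = 0;

    public Rc4(byte[] key) {
        for (int k = 0; k < 256; k++) s[k] = k;   // start from identity
        int jj = 0;
        for (int k = 0; k < 256; k++) {           // KSA: key-dependent shuffle
            jj = (jj + s[k] + (key[k % key.length] & 0xFF)) & 0xFF;
            int t = s[k]; s[k] = s[jj]; s[jj] = t;
        }
    }

    // PRGA: XOR each input byte with the next keystream byte.
    // Encryption and decryption are the same operation.
    public byte[] crypt(byte[] data) {
        byte[] out = new byte[data.length];
        for (int n = 0; n < data.length; n++) {
            i = (i + 1) & 0xFF;
            j = (j + s[i]) & 0xFF;
            int t = s[i]; s[i] = s[j]; s[j] = t;
            out[n] = (byte) (data[n] ^ s[(s[i] + s[j]) & 0xFF]);
        }
        return out;
    }
}
```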

 WPA to the rescue

Not many know that WPA was implemented as two rounds of RC4 using a time-based key (TKIP). There were in fact too many RC4 chips already on the market once RC4 was deemed unsafe.

WPA has been for a long time

Continue reading →


Synchronization is not an option

I recently had an interesting discussion with a Java developer around multithreading.

Multithreading is one of the most complex topics in Computer Science. It’s no surprise that very few programming languages (Java and C++ among them) offer the full power (and complexity) of thread management to the developer.


 Atomic assignments

In Java, assignments are guaranteed to be atomic for

  • all assignments of primitive types, except for long and double
  • all assignments of references

Unfortunately, many developers tend to forget the exception in the first line.

The following code is therefore not correct:

public class Config {
    private long lastUpdated;
    private Map<String, String> dataMap;
    ...
    public Map<String, String> getConfig() {

        if (lastUpdated < System.currentTimeMillis() - 5 * 60 * 1000) // ERROR: not synchronized
            reloadConfig();

        return dataMap;
    }

    public void reloadConfig() {
        ...
        // we
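
One safe variant can be sketched as follows (my own sketch, not necessarily the fix from the full article; the map type and reload body are assumed): make lastUpdated volatile so the long write is atomic, publish a fresh map through a volatile reference, and serialise reloads with synchronized.

```java
import java.util.HashMap;
import java.util.Map;

// Thread-safe config sketch: volatile guarantees atomic long reads/writes
// and safely publishes the newly built map; synchronized ensures at most
// one thread reloads at a time.
public class SafeConfig {
    private static final long MAX_AGE_MS = 5 * 60 * 1000;

    private volatile long lastUpdated = 0;
    private volatile Map<String, String> dataMap = new HashMap<>();

    public Map<String, String> getConfig() {
        if (System.currentTimeMillis() - lastUpdated > MAX_AGE_MS) {
            reloadConfig();
        }
        return dataMap;
    }

    public synchronized void reloadConfig() {
        if (System.currentTimeMillis() - lastUpdated <= MAX_AGE_MS) {
            return; // another thread already reloaded while we waited
        }
        Map<String, String> fresh = new HashMap<>();
        fresh.put("reloaded", "true"); // stand-in for the real fetch
        dataMap = fresh;               // volatile write publishes the map
        lastUpdated = System.currentTimeMillis();
    }
}
```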

Continue reading →


Network partitions and nosql

Many distributed noSQL datastores claim support for consistency or availability.

From the CAP theorem we know that a distributed system under partition (P) can be either Consistent or Available, but not both at the same time.


 Partitions: Why care about them?

Because they can happen at any time, with any distributed deployment.

The Internet is not just a big LAN, and some regions can at times get disconnected randomly.

Network partitions can happen within the same datacentre as well. Hardware can always fail within the same LAN (or Availability Zone), and not caring about network topology changes is simply a bad practice. Full stop.

Ideally you would want a datastore to never lose writes (CP) under any circumstance. No business is keen on losing data, after all.

It’s puzzling to see how hard this is to achieve when our datastore uses some kind of distributed architecture.

Most nosql

Continue reading →


5 reasons for not using db triggers

Triggers are a nice little feature that most relational DBMSs offer to automatically execute code in response to certain events on a particular table or view.

Their primary aim is to maintain data integrity, but they are often used to enforce arbitrary data transformations in response to changes or generic events on the database.

Here are 5 reasons why you shouldn’t use them though.


 1. Triggers add programming magic to your application

Once you add triggers to a table, the results of updates to that table no longer depend solely on the SQL statement executed: no developer can tell the outcome of an INSERT/UPDATE/DELETE statement by just looking at the SQL code.

 2. Triggers add business logic in the DB

Business logic in the data layer is a bad idea: it violates the principle of separation of concerns of a multi-tier architecture.
Moreover your business logic is now coded in at

Continue reading →


Gentle backups with mysqldump

More often than not we are required to take backups of MySQL databases running in production.

If the db size is small there are relatively few problems. Once the database starts growing in size, the risk is stealing resources away from your app and hence impacting customer experience.

 1. Out of the box

Out of the box, mysqldump can be invoked as follows:

mysqldump -h something.com -u user -pPassword > /backup/backup.sql

 2. Switching to repeatable-reads

If your underlying storage engine is InnoDB and your table structure is not going to change during the backup, we can safely add the --single-transaction option: this way the export will run in a single transaction with repeatable-read isolation level, giving you a consistent backup.

mysqldump -h something.com -u user -pPassword --single-transaction > /backup/backup.sql

PS: the --quick option is enabled by default since MySQL 5.1.

Continue reading →


Ruby madness

Highly recommended blog post: How We Moved Our API From Ruby to Go and Saved Our Sanity

Some excerpts (emphasis added):

A year and a half in, at the end of 2012, we had 200 API servers running on m1.xlarge instance types with 24 unicorn workers per instance. This was to serve 3000 requests per second for 60,000 mobile apps. It took 20 minutes to do a full deploy or rollback, and we had to do a bunch of complicated load balancer shuffling and pre-warming to prevent the API from being impacted during a deploy.

After rewriting the EventMachine push backend to Go we went from 250k connections per node to 1.5 million connections per node without even touching things like kernel tuning.

Was the rewrite worth it? Hell yes it was. Our reliability improved by an order of magnitude.

We could downsize our provisioned API server pool by about 90%

As if that weren’t enough, the time it takes to

Continue reading →


Object oriented hell

I stumbled upon this Quora answer a couple of days ago and I couldn’t agree more with the points made:


 OO didn’t lead to better code

It’s just an abstraction like any other (e.g. functional programming), to be honest. Not a magic wand that will make programs write themselves.

Java, and to a lesser extent C++, were used improperly, leading to increased complexity instead of encapsulating it when really necessary.

Avoid inheritance at all costs. The only place I’ve seen inheritance work properly is with UI frameworks. So unless you are rewriting Swing or the Flash Virtual Machine, don’t do it.

 An excuse for ignorance

Many Java developers end up knowing nothing about how a computer actually works. They are just trained to write classes but can’t tell you how an HTTP call works… what a host header is… how cookies work, etc.

When asked to help with recruiting web developers the first

Continue reading →


Email deliverability

The Simple Mail Transfer Protocol permits any computer to send email claiming to be from any source address.
It was conceived for a very trusting environment, not exactly what the internet is today…


 Welcome spam

Over time the amount of spam has grown substantially, and with it the effort to filter it out.

The net result is that one can never expect 100% deliverability of emails.

Automatically generated emails are the most affected as:

  • they are sent on behalf of the sender by 3rd party servers (exactly like spam)
  • ISPs have learned to filter aggressively against them
  • they don’t originate from trusted domains/servers (Gmail, Hotmail, etc.)

Unless you really know what you are doing (like the guys at Basecamp), never try to send emails yourself; trust the people that do that for a living and use a 3rd party service.

 What to expect

Email deliverability for automated emails can

Continue reading →