How to Secure (RESTful) Web Services


Application security is now one of the hottest topics in IT departments. We are fully in the age of the Service Oriented Architecture (SOA), regardless of how it has been rebranded by marketing departments. It is the norm to build services and provide them to our clients, partners and suppliers. Retailers such as Amazon have opened their doors to third party developers through their APIs, letting them make use of vast resources, be it cloud computing or buying and selling items. We build services and make them available through a (well designed) application programming interface (API). There is a downside to making our API available to third parties: we make ourselves vulnerable to attackers trying to compromise our systems. This post discusses ways to secure web services against such malicious behaviour. Remember, nothing is secure in theory, but let's make it darn hard to break in practice. Security has to be a forethought when starting to develop a new service.

There are three layers of security to be addressed when developing APIs and making them available over the internet to third party users:

Web Services Security 3 Layer

These layers also apply to SOAP based web services and to any network distributed system. The diagram above should be read bottom-up. Even though the layers look and sound similar to the OSI model, they are not the same and should not be confused.

Transport layer

When making a service available over a network, we are utilising the transport layer. Its role is to ensure that third party clients can make a reliable connection to our services; in other words, the transport layer creates a communication link between our API and its consumers. In the context of web services, that link is created over HTTP. We all know that HTTP in its bare bones is not secure, yet we still see people making their APIs available over plain HTTP. Your company may have developed its own proprietary encryption to add on top of HTTP, but in most cases the standard options below are used:

Legacy network applications communicating over the internet were mostly secured using Secure Sockets Layer (SSL). Certificate vendors branded their products as SSL certificates, so when Transport Layer Security (TLS) was introduced, they kept the SSL certificate name. TLS is the evolution of SSL; needless to say, all API communication should be done over HTTPS, preferably using TLS, as it fixes some of SSL's vulnerabilities.
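As a minimal sketch of what "preferably using TLS" means on the client side, the Python standard library's default TLS context already enforces certificate and hostname validation; a client should never switch these checks off:

```python
# Sketch: a client-side TLS context with certificate validation enforced.
# create_default_context() turns on certificate-chain verification and
# hostname checking by default -- disabling either in production defeats
# the point of using HTTPS.
import ssl

context = ssl.create_default_context()

# The chain must validate against trusted CAs...
chain_checked = context.verify_mode == ssl.CERT_REQUIRED
# ...and the server's certificate must match the hostname we dialled.
hostname_checked = context.check_hostname
```

Any HTTP client built on this context will refuse to talk to a server presenting an invalid or mismatched certificate, which is exactly the behaviour we want from API consumers.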

Presentation Layer

Clients initiate communication with a web service through the transport layer by using the web service's URI. Once the connection is established, the client moves on to the presentation layer, where requests can be made. Depending on the system architecture and the purpose of the web service, the client can make either anonymous or authenticated calls. Protected web service calls should be available only to authenticated parties. The most common way to block anonymous callers from making restricted calls is to force them to confirm their identity. Here are three common ways to validate third party callers:

  • Basic Authentication: the web service requests that the third party caller identifies itself by providing a valid username and password combination, which is used to create a session for the duration of the communication.
  • OAuth 1 or 2: OAuth allows a third party client to access users' resources without the users sharing their credentials. It is commonly used by services such as Facebook, Twitter and LinkedIn to authorise third party applications to log onto their sites.
  • Identity Certificate: identity certificates are, in many ways, similar to SSL certificates. The Certificate Authority (CA), which could be your own company, signs and endorses a certificate on behalf of a third party; you provide the third party with an identity certificate signed with your key. This can be very secure, in the same way as HTTPS. Third party clients should not self-sign their certificates. As the certificates live on the callers' devices, a stolen device can become a security risk.

A recommendation would be to secure your web services API with Basic Authentication over HTTPS; this approach is the most popular and the most tested on the internet. OAuth v1 would be recommended over v2 for transmitting highly sensitive data. OAuth's suitability for commercial APIs is questionable, as opposed to web sites whose aim is to amass a large user base.
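To illustrate, here is a sketch of how a Basic Authentication header is assembled (the `user`/`pass` credentials are placeholders). Note that Base64 is an encoding, not encryption, which is exactly why the HTTPS channel underneath is non-negotiable:

```python
# Sketch: building the Authorization header for Basic Authentication.
# The credentials below are placeholders; real credentials must only ever
# travel over HTTPS, since Base64 can be trivially decoded.
import base64

def basic_auth_header(username: str, password: str) -> str:
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

header = basic_auth_header("user", "pass")
# The server decodes the token, splits on ':' and validates the pair.
```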

Application Layer

This layer is ignored in most conversations about web services security. A simple Google search on the subject returns results which address only the transport and presentation layers. So why discuss 'securing web services at the application layer'? It is as important as the two previous layers. Let's put it into context: a third party client's systems have been compromised, and the attackers obtained some credentials to our web services. With no application layer security in place, the attacker can connect to our services and make all sorts of requests. This is a fictitious yet very probable scenario. So let's tackle how we handle user authorisation in our web services; remember that authorisation is the process of verifying what you have access to. Here are two possible solutions:

  1. Digital Asset Manager (DAM)
  2. Custom development

Application layer security would cover group, role, domain or hierarchy-level access control. It is often faster and cheaper to build a custom solution or procure an open source alternative. Regardless of which route you venture into, application layer security has to be implemented to have a fully secured web service.
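As a sketch of what such a check might look like, the snippet below implements a tiny role-based authorisation function. The in-memory role store, user names and permission names are invented for illustration; a real system would look roles up in its own identity store:

```python
# Sketch: role-based authorisation at the application layer.
# Even a caller with valid (possibly stolen) credentials is limited to
# the operations their role actually grants.
ROLES = {
    "alice": {"orders:read", "orders:write"},
    "bob": {"orders:read"},
}

def authorise(user: str, permission: str) -> bool:
    """Return True only if the authenticated user holds the permission."""
    return permission in ROLES.get(user, set())

def cancel_order(user: str, order_id: str) -> str:
    if not authorise(user, "orders:write"):
        # Authentication is not enough: the caller must also be
        # authorised for this specific operation.
        raise PermissionError(f"{user} may not modify orders")
    return f"order {order_id} cancelled"
```

With this in place, an attacker holding bob's stolen credentials can still only read orders, not cancel them, which is precisely the damage limitation the application layer provides.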


We have discussed the three layers of web services security. Remember: in theory nothing is secure, but we should make it near impossible to break in practice. Architects have to consider all three layers when designing APIs that will be available to third parties over the internet. All web services communication should be conducted over a secured channel such as HTTPS. Basic Authentication can handle most authentication requests and is a secure way to exchange user credentials over HTTPS. Application layer security should be implemented to handle authorisation to resources. Remember, code defensively to mitigate the risk of a security breach.


Web Services Architecture – When to Use SOAP vs REST

SOAP (Simple Object Access Protocol) and REST (Representational State Transfer) are both popular with developers working on system integration projects. Software architects design the application from various perspectives and also decide, for various reasons, which approach to take when exposing a new API to third party applications. As a software architect, it is good practice to involve your development team lead in the system architecture process.
This article, based on my experience, discusses when to use SOAP or REST web services to expose your API to third party clients.

Web Services Demystified

Web services are part of the Service Oriented Architecture. They are used as the model for process decomposition and assembly. I have been involved in discussions where there was some misconception between web services and Web APIs.
The W3C defines a Web Service generally as:


A software system designed to support interoperable machine-to-machine interaction over a network.


A Web API, also known as a Server-Side Web API, is a programmatic interface to a defined request-response message system, typically expressed in JSON or XML, which is exposed via the web, most commonly by means of an HTTP-based web server (extracted from Wikipedia).

Based on the above definitions, one might infer when SOAP should be used instead of REST and vice versa, but it is not as simple as it looks. We can agree that Web Services are not the same as Web APIs. Accessing an image over the web is not calling a web service but retrieving a web resource using its Uniform Resource Identifier. HTTP has a well-defined, standard approach to serving resources to clients and does not require a web service to fulfil the request.


Why Use REST over SOAP

Developers are passionate people. Let's briefly analyse some of the reasons they mention when choosing REST over SOAP:


REST is easier than SOAP

I'm not sure what developers refer to when they argue that REST is easier than SOAP. In my experience, depending on the requirements, developing REST services can quickly become just as complex as any other SOA project. What is your service abstracting from the client? What level of security is required? Is your service a long-running asynchronous process? These and many other requirements will increase the level of complexity. Testability: it is apparently easier to test RESTful web services than their SOAP counterparts. This is only partially true. For simple REST services, developers only have to point their browser at the service endpoint and a result is returned in the response. But what happens once you need to add HTTP headers, pass tokens or validate parameters? This is still testable, but chances are you will need a browser plugin to test those features. If a plugin is required, then the ease of testing is exactly the same as using SoapUI to test SOAP based services.
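To illustrate the point about headers and tokens, here is a sketch of assembling such a request programmatically; the URL and the bearer token are hypothetical, and the request is only constructed, not sent:

```python
# Sketch: once a call needs custom headers and a token, "point the browser
# at the endpoint" no longer works -- the request has to be assembled
# with a client. URL and token below are invented for illustration.
import urllib.request

req = urllib.request.Request(
    "https://api.example.com/v1/orders?limit=10",
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer not-a-real-token",
    },
    method="GET",
)
# urllib.request.urlopen(req) would send it; we stop at construction here.
```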


RESTful Web Services serve JSON, which is faster to parse than XML

This so-called "benefit" relates to consuming web services in a browser. RESTful web services can also serve XML, or any MIME type you desire. This article is not focused on JSON vs XML, and I won't write a separate article on the topic. JSON relates to JavaScript, and as JS is very close to the web, providing interaction alongside HTML and CSS, most developers automatically assume that it is also tied to interacting with RESTful web services. If you didn't know before, I'm sure you can guess: RESTful web services are language agnostic.
Regarding the speed of processing XML markup as opposed to JSON, a performance test conducted by David Lead, Lead Engineer at MarkLogic Inc, found this to be a myth.


REST is built for the Web

Well, this is true according to Roy Fielding's dissertation; after all, he is credited with the creation of the REST architectural style. REST, unlike SOAP, uses the web's underlying technology for transport and communication between clients and servers. The architectural style is optimised for the modern web architecture. The web has outgrown its initial requirements, as can be seen through HTML5 and the WebSocket standardisation. The web has become a platform in its own right, maybe a WebOS. Some applications will require server-side state, from financial applications to e-commerce.



When using REST over HTTP, it utilises the features available in HTTP, such as caching, security in terms of TLS, and authentication. Architects know that dynamic resources should not be cached. Let's discuss this with an example: we have a RESTful web service that serves stock quotes when provided with a stock ticker. Stock quotes change by the millisecond; if we request a quote for BARC (Barclays Bank), the quote we received a minute ago may well be different two minutes later. This shows that we cannot always use the caching features implemented in the protocol. HTTP caching can be useful for client requests of static content, but if the caching features of HTTP are not enough for your requirements, then you should also evaluate SOAP, as you will be building your own cache either way rather than relying on the protocol.
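As a sketch of that decision, the function below picks cache headers per resource type rather than trusting blanket HTTP caching; the path classification and max-age value are assumptions for illustration:

```python
# Sketch: choosing Cache-Control per resource. A volatile stock quote must
# never be served stale, while static content can safely be cached.
def cache_headers(resource_path: str) -> dict:
    volatile_prefixes = ("/quotes",)  # dynamic data: never cache
    if resource_path.startswith(volatile_prefixes):
        return {"Cache-Control": "no-store"}
    # Static content (images, documentation) cached for a day here.
    return {"Cache-Control": "max-age=86400, public"}

quote_policy = cache_headers("/quotes/BARC")
static_policy = cache_headers("/static/logo.png")
```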


HTTP Verb Binding

HTTP verb binding is supposedly a feature worth discussing when comparing REST vs SOAP. Many public-facing APIs referred to as RESTful are really REST-like and do not implement all the HTTP verbs in the manner intended. For example, when creating new resources, most developers use POST instead of PUT; even deletions are often sent through a POST request instead of DELETE.
SOAP also defines a binding to the HTTP protocol; when binding to HTTP, all SOAP requests are sent through POST.
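As an illustrative sketch, the mapping below binds CRUD operations to the verbs REST intends for them instead of tunnelling everything through POST; the operation names and paths are invented:

```python
# Sketch: CRUD operations bound to their intended HTTP verbs.
VERB_FOR = {
    "create_or_replace": "PUT",    # idempotent: repeating it changes nothing
    "create_subordinate": "POST",  # server picks the new resource's URI
    "read": "GET",
    "update_partial": "PATCH",
    "delete": "DELETE",
}

def request_line(operation: str, path: str) -> str:
    """Render the HTTP request line for a given CRUD operation."""
    return f"{VERB_FOR[operation]} {path} HTTP/1.1"
```

The practical difference is idempotency: a retried PUT or DELETE is harmless, while a retried POST may create a duplicate, which is why verb-correct APIs are easier to make robust.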



Security is rarely mentioned when discussing the benefits of REST over SOAP. Two simple security measures are provided at the HTTP protocol layer: basic authentication and communication encryption through TLS. SOAP security is well standardised through WS-Security. HTTP is not secure, as seen in the news all the time, therefore web services relying on the protocol need to implement their own rigorous security. Security goes beyond simple authentication and confidentiality; it also includes authorisation and integrity. When it comes to ease of implementation, I believe SOAP is at the forefront.



This was meant to be a short blog post, but it seems we got too passionate about the subject.
I accept that there are many other factors to consider when choosing between SOAP and REST, but I will oversimplify it here. For machine-to-machine communication, such as business processing with BPEL, or where transaction security and integrity matter, I suggest using SOAP. SOAP binding to HTTP is possible, and XML parsing is not noticeably slower than JSON in the browser. For building a public facing API, REST is not the undisputed champion; consider the actual application requirements and evaluate the benefits. The argument that REST is protocol agnostic and works on anything that has a URI is beside the point: according to its creator, REST was conceived for the evolution of the web. Most so-called RESTful web services available on the internet are really REST-like, as they do not follow the principles of the architectural style. One good thing about working with REST is that applications do not need a service contract à la SOAP (WSDL). WADL was never standardised, and I do not believe developers would implement it; I remember looking for Twitter's WADL when integrating with it.
I will leave you to draw your own conclusions; there is only so much I can write in a blog post. Feel free to leave comments to keep the discussion going.


Big Data Architecture Best Practices

The marketing departments of software vendors have done a good job making Big Data go mainstream, whatever that means. The promise: we can achieve anything if we make use of Big Data, from business insight to beating our competition into submission. Yet there is no well-publicised, successful Big Data implementation. The question is: why not? Businesses have invested billions of dollars in this silver bullet but seen no return on investment. Who is to blame? After all, businesses do not have to publicise their internal processes or projects. I have a different view: the cause lies with the IT department. Most Big Data projects are driven by the technologists, not the business, and there is a real lack of understanding in aligning the architecture with the business's vision for the future.

The Preliminary Phase

Big Data projects are no different from any other IT projects. All projects spur out of business needs and requirements. This is not The Matrix; we cannot answer questions which have not been asked yet. Before any work begins, or any discussion around which technology to use, all stakeholders need to have an understanding of:

  • The organisational context
  • The key drivers and elements of the organisation
  • The requirements for architecture work
  • The architecture principles
  • The framework to be used
  • The relationships between management frameworks
  • The enterprise architecture maturity

In the majority of cases, Big Data projects involve knowing the current business technology landscape, in terms of current and future applications and services:

  • Strategies and business plans
  • Business principles, goals, and drivers
  • Major framework currently implemented in the business
  • Governance and legal frameworks
  • IT strategy
  • Pre-existing Architecture Framework, Organisational Model, and Architecture repository

The Big Data Continuum

Big Data projects are not, and should never be, executed in isolation. The simple fact that Big Data needs to feed from other systems means there should be a channel of communication open across teams. In order to have a successful architecture, I came up with five simple layers/stacks for a Big Data implementation. To the more technically inclined architect, these will seem obvious:

  • Data sources
  • Big Data ETL
  • Data Services API
  • Application
  • User Interface Services
Big Data Protocol Stack

Data Sources

Current and future applications will produce more and more data, which will need to be processed in order to gain any competitive advantage from it. Data comes in all sorts, but we can categorise it into two kinds:

  1. Structured data – usually stored following a predefined format, such as using known and proven database techniques. Not all structured data is stored in databases; many businesses use flat files, such as Microsoft Excel or tab-delimited files, for storing data.
  2. Unstructured data – businesses generate great amounts of unstructured data, such as emails, instant messaging, video conferencing, internet content, and flat files such as documents and images; the list is endless. We call the data "unstructured" because it does not follow a format which would make it easy for a user to query its content.

I spent a large part of my career working on enterprise search technology before the term "Big Data" was even coined. Understanding where the data is coming from, and in what shape, is valuable to a successful Big Data ETL implementation. Before a single line of programming code is written, architects will have to try to normalise the data into a common format.
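As a sketch of such normalisation, the snippet below folds records from two hypothetical sources, a CSV extract and a raw email, into one common shape; all the field names are assumptions for illustration:

```python
# Sketch: normalising structured (CSV) and unstructured (email) records
# into one common shape before the ETL stage.
import csv
import io

def from_csv(row: dict) -> dict:
    return {"source": "csv", "id": row["id"], "text": row["comment"]}

def from_email(raw: str) -> dict:
    # Treat the first line as the subject and the remainder as the body.
    subject, _, body = raw.partition("\n")
    return {"source": "email", "id": None, "text": body.strip() or subject}

csv_rows = csv.DictReader(io.StringIO("id,comment\n1,late delivery\n"))
records = [from_csv(r) for r in csv_rows]
records.append(from_email("Complaint\nThe parcel arrived damaged."))
# Every record now has the same keys, whatever its origin.
```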

Big Data ETL

This is the part that excites technologists, especially development teams. There are so many blogs and articles published every day about Big Data tools that it creates confusion among non-technical people. Everybody is excited about processing petabytes of data using the coolest kid on the block: Hadoop and its ecosystem. Before we get carried away, we first need to put some baselines in place:

  • Real-time processing
  • Batch processing
Big Data – Data Consolidation

The purpose of an Extract Transform Load project, regardless of whether Hadoop is used, is to consolidate the data into a single view, a Master Data Management (MDM) system, for querying on demand. Hadoop and its ecosystem deal with the ETL aspect of Big Data, not the querying part. The tools used will depend heavily on the processing needs of the project, real-time or batch; Hadoop, for instance, is a batch processing framework for large volumes of data. Once the data has been processed, the MDM can be stored in a data repository such as a NoSQL store or an RDBMS; this depends only on the querying requirements.
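As an illustration of the batch model Hadoop popularised, here is the classic word count reduced to an in-memory sketch; in a real cluster the map and reduce phases would run distributed across far larger volumes, and the documents here are made up:

```python
# Sketch: the map/reduce shape of a batch ETL step, shrunk to a toy
# in-memory word count over a made-up "dataset".
from collections import Counter
from itertools import chain

documents = ["buy BARC", "sell BARC", "buy VOD"]

# Map phase: emit one token per word across all documents.
mapped = chain.from_iterable(doc.split() for doc in documents)
# Shuffle + reduce phase: aggregate counts per key.
reduced = Counter(mapped)
```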

Data Services API

While most of the limelight goes to the ETL tools, a very important area is usually overlooked until later, almost as an afterthought. The MDM will need to be stored in a repository in order for the information to be retrieved when needed. In a true Service Oriented Architecture spirit, the data repository should expose interfaces to external third party applications for data retrieval and manipulation. In the past, MDM systems were mostly built on RDBMSs, with retrieval and manipulation carried out through the Structured Query Language. This does not have to change, but architects should be aware of other forms of database, such as the NoSQL types. The following questions should be asked when choosing a database solution:

  • Is there a standard query language?
  • How do we connect to the database: DB drivers or available web services?
  • Will the database scale when the data grows?
  • What security mechanisms are in place for protecting some or all of the data?

Other questions specific to the project should also be included in the checklist.

Business Applications

So far, we have extracted the data, transformed it and loaded it into a Master Data Management system. The normalised data is now exposed through web services (or DB drivers) to be used by third party applications. Business applications are the reason to undertake Big Data projects in the first place. Some will argue that we should hire Data Scientists. According to many blogs, the Data Scientist's role is to understand the data, explore the data, prototype (new answers to unknown questions) and evaluate their findings. This is interesting, as it reminds me of the motion picture The Matrix, where the Architect knew the answers to the questions before Neo had even asked them, and decided which ones were relevant. That is not how businesses are run. It would be extremely valuable if the data scientist could subconsciously suggest (Inception) a new way of doing something, but most of the time the questions will come from the business, to be answered by the Data Scientist or whoever knows the data. The business applications will be the answer to those questions.

User Interfaces Services

User interfaces make or break the project: a badly designed UI will hinder adoption regardless of the data behind it, while an intuitive design will increase adoption, and perhaps users will then start questioning the quality of the data. Users will access the data in different ways: mobile, TV and web, for example. Users usually focus on a certain aspect of the data and will therefore require it to be presented in a customised way. Other users will want the data available through their current dashboard, matching their existing look and feel. As always, security will also be a concern. Enterprise portals have been around for a long time and are usually used for data integration projects. Standards such as Web Services for Remote Portlets (WSRP) make it possible for user interfaces to be served through web service calls.


This article shows the importance of architecting a Big Data project before embarking on it. The project needs to be in line with the business vision, with a good understanding of the current and future technology landscape. The data needs to bring value to the business, and therefore the business needs to be involved from the outset. Understanding how the data will be used is key to success, and taking a service oriented architecture approach will ensure that the data can serve many business needs.


4 Reasons Football Teams Should Use Big Data and IoT

This is a case for why football teams (or soccer teams, for my transatlantic friends) should make greater use of the Internet of Things and big data.
Here are four simple reasons:

1. Player health tracking

In 2012, aged only 24, Fabrice Muamba, a footballer at Bolton Wanderers, suffered a heart attack on live TV during an FA Cup game. This was not the first such incident on live TV: Marc-Vivien Foe, the Cameroonian player, died on 26 June 2003 during a Confederations Cup game against France. Not to mention what happens off-pitch: football clubs have a duty to look after their players, and that includes their health. Let's be clear, I'm not saying that current health wearable devices could have saved them, but they could have played a great role. Read the signs early to avoid disaster; prevention will always be better than the cure.

So how can big data help us improve athletes' health, you may ask? I will simplify it for illustration purposes. Teams need to put in place an IoT and big data strategy where athletes wear various devices during training and games. Think of heart monitoring bracelets that players wear during training and games, for the physiotherapy team to analyse later or in real time. The devices would record the time, the number of footsteps (within a period, in order to differentiate between walking and running) and the heart rate. Say that player John Smith's health changes during the game: the physios get alerted to the change of condition and can take appropriate action. Maybe we can even add devices to shin pads so that players do not have to wear uncomfortable devices.
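As a simplified sketch of that alerting, the generator below flags heart-rate readings that cross a threshold; the readings and the threshold are made-up numbers for illustration, not medical guidance, and a real system would use per-player baselines set by the physio team:

```python
# Sketch: threshold alerting over a stream of (minute, bpm) readings
# from a hypothetical wearable. Numbers are illustrative only.
def alerts(readings, max_bpm=200):
    """Yield (minute, bpm) for every reading above the safe threshold."""
    for minute, bpm in readings:
        if bpm > max_bpm:
            yield (minute, bpm)

match_feed = [(1, 148), (2, 151), (3, 207), (4, 149)]
flagged = list(alerts(match_feed))  # readings the physios should see
```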

2. Player positioning and technique

Great players do this naturally, but not all players are born equal. I remember when I used to play football in my early teens for USO, back in France. In training, we would learn how to shoot, head the ball and other basic skills. The trainer would then tell us what we did wrong and how to improve our technique. Professional players now have a large team working with them daily: strikers have coaches to help with attacking positioning, goalkeepers have their own coaches, and so on. The current tech used by the teams consists of heavy duty machines employed behind closed doors. How can IoT improve on this? Again, through wearable technology we can track players' movements. GPS-integrated devices will track a player's movement over a map (the football field), and shin-pad-enabled devices will help visualise the player's shooting technique. The list of devices is limited only by your imagination; the possibilities are endless. Players can then analyse their own data, or with the help of their coach, see how they can improve. No more killing pigeons with those bad shots over the bar!

3. Team metrics

Now, what is the point of all that generated data if it is not put to use? We have health data and game metrics (positioning and technique) for the whole team. The players' health data will give us an indication of who is fit for the game. As a big fan of the beautiful game, I have observed how various conditions can affect a player's performance, such as the weather and the pitch condition. The data scientists can analyse who will perform better under the current conditions. Don't get me wrong: top players, if fit for the game, are indispensable to the team, but all advice should be welcomed. You will have what we call "the best team on paper". Gamblers use this technique in various events to pick their winners and losers.
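As a toy sketch of "the best team on paper", the snippet below ranks players by a fitness score weighted by match conditions; the players, numbers and weighting are invented for illustration, and a real model would be built by the team's analysts:

```python
# Sketch: ranking a (tiny, made-up) squad by condition-weighted fitness.
players = [
    {"name": "Smith", "fitness": 0.92, "wet_pitch_form": 0.70},
    {"name": "Jones", "fitness": 0.81, "wet_pitch_form": 0.95},
    {"name": "Brown", "fitness": 0.60, "wet_pitch_form": 0.90},
]

def score(player: dict, wet_pitch: bool) -> float:
    # On a wet pitch, discount raw fitness by the player's wet-pitch form.
    weight = player["wet_pitch_form"] if wet_pitch else 1.0
    return player["fitness"] * weight

def pick_team(players, wet_pitch, squad_size=2):
    """Return the top players for the given conditions (tiny squad here)."""
    return sorted(players, key=lambda p: score(p, wet_pitch), reverse=True)[:squad_size]
```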

4. Opponents analysis

Usually we can find a wealth of information about teams from various sources such as sports sites, videos and blogs. The more you know about your opponents, the better you will perform (Sun Tzu – The Art of War). Teams change twice a year (the transfer windows), for those who can afford it. Is a team really as good as its last game? This question can be answered with big data analytics. Taking the conditions into consideration, along with our own team metrics, we can pick a strategy (team formation, defensive or attacking, and so on). Big data gives us (all) the information we need in order to make decisions.


The Internet of Things (IoT) and Big Data provide an opportunity, even if not a new one, to gather information and make executive decisions much faster than before. There is no reason not to have the full picture beforehand. Football, like any other sport, is very competitive, and everyone wants to create an advantage. IoT devices are getting cheaper, and more data can be collected for analysis. Big Data has the tools, but it is not a silver bullet. We can prevent tragedies by tracking our players' health, and help them improve by analysing their technique. Let's be realistic: winning a game is probably beyond Big Data and IoT alone, but they will play a big part in shaping the future of the game.

Big Data is not a product – the idiosyncratic hype

Marketers have been building the hype around “Big Data”. Recruitment agencies have received many positions to fill around “Big Data”.  But who is to blame when you receive a call from a recruiter who believes that “Big Data” is a product?


"Where ignorance is bliss, 'tis folly to be wise." – Thomas Gray

A recruiter who does not carry out any research on the role he is trying to fill is simply an idiot. The fact is that job specs from companies can be very brief, and the recruiter's job is to source the best candidate for the role. Engineers hate recruiters, but they feel they need to speak to them if they are to land that next big opportunity. I had a call from a recruiter asking me to send him one of our developers. He was adamant that "Big Data" was a product and kept asking whether our developers had experience with the product called "Big Data". I tried to make him understand that there is no such product that I know of. After a while, he became aggressive and I had to end the call. I blame the recruiter for his lack of research and his ignorance of "Big Data" and its relevant technologies. Had he carried out some research on the topic, I believe he would have appeared more professional.


IT departments looking for "Big Data" practitioners have to share some of the blame. They write the job specs that are sent out to a recruitment company. You can't send out a job description for a data analyst that only has Hadoop and "Big Data" in it. You need to provide more information, such as a proper job description, the project background, and the sort of person you are after. Engineers are not marketers, but they have a better understanding of the technologies. We know what we want and who can do the job, so why not make the recruiter's job easier by providing more information beforehand?
In conclusion, Big Data is not a product, and the next time someone calls to tell me it is, I will put the phone down on him. Research your topic before contacting anybody about a job opportunity.

A software architect is not a senior developer

To this day there are some IT departments who believe that by hiring a senior developer they can fulfil the role of a software architect. Senior developers have much knowledge of the full software lifecycle and can be trained to become architects, but they are not architects yet. A software architect is first and foremost a visionary. It helps that an architect has some software development experience, but most of the time he will be exposed to a polyglot environment. Before a single line of code is written, the architect has to map out how the business requirements can be translated into a solution. This requires not only knowledge of the business environment, from operations to infrastructure, but also the ability to present a convincing system to the stakeholders. Requirements such as scalability, latency and security will be missed at the initial development stage if not tackled early on. Senior developers understand their team and its abilities; they can manage workloads amongst team members and make sure that the project under development meets its architectural goals.
The architect decides how a requirement should be realised in order to meet the business need. As an example:

The business has offices around the globe; the business requirements require the website to be fully loaded within 3s regardless of the location of the user, and to handle a minimum load of a hundred thousand users.

The above requirements deal with the architecture of the system, not with whether we can authenticate a user against an Oracle DB.
It is important to note that many software architects previously worked as senior developers (such as myself); nonetheless, many senior developers have no interest in architecture. Choosing whether a system should use Tomcat or GlassFish, with Apache Web Server for load balancing, is the domain of the architect. Doing code reviews and making sure software development patterns are well applied is the domain of the senior developer. A senior developer can also choose a development methodology, such as Scrum, with the approval of the project manager. The architect attends meetings with the various stakeholders: end users, operations, infrastructure, development and testing teams. When the business asks why the system is slow, they will turn to the architect. The architect then has to sit down with the lead senior developer and review whether the current development meets the architecture goals, or whether there are faults in the architecture design.
I am a software architect and I can easily communicate my vision to the development team, but I am also a senior developer who still loves hacking on machines. I have worked on an architecture committee and came across architects who had no development experience, which I think is wrong. An architect should appreciate other development languages and not be biased towards a single one without merit.
I hope that more companies will come to appreciate the role of software architects in software projects, regardless of their size.