Juan de Bravo — GeistHaus

Configure MCP server with asdf for Node.js

Apr 12, 2025 Updated Apr 12, 2025


Today I want to share how to configure an MCP (Model Context Protocol) server to run Node.js applications using asdf as the version manager. This setup is particularly useful when you need to ensure consistent Node.js versions across different environments and want to leverage asdf’s powerful version management capabilities.

The Challenge

When setting up an MCP server to run Node.js applications, you might encounter issues with Node.js version management and environment variables. The key is to ensure that the MCP server can find and use the correct Node.js version managed by asdf.

The Solution

Here’s a working configuration example for an MCP server that runs a Node.js application:

{
    "mcpServers": {
        "your-mcp-server": {
            "command": "node",
            "args": [
                "/Users/foo/projects/mcp-server/build/index.js"
            ],
            "env": {
                "PATH": "/Users/foo/.asdf/shims:/usr/local/bin:/usr/bin:/bin",
                "ASDF_DIR": "/Users/foo/.asdf",
                "ASDF_DATA_DIR": "/Users/foo/.asdf",
                "ASDF_NODEJS_VERSION": "22.14.0"
            }
        }
    }
}

Let’s break down the important parts of this configuration:

Environment Variables:
- PATH: Includes the asdf shims directory first, ensuring that asdf-managed executables are found before system ones.
- ASDF_DIR: Points to your asdf installation directory.
- ASDF_DATA_DIR: Specifies where asdf stores its data.
- ASDF_NODEJS_VERSION: Explicitly sets the Node.js version to use.
Command Configuration:
- command: Set to “node” to use the Node.js executable.
- args: Contains the path to your Node.js application’s entry point.

Why This Works

The configuration works because it:

Ensures the correct Node.js version is used by setting the ASDF_NODEJS_VERSION environment variable
Makes asdf-managed executables available by adding the shims directory to the PATH
Provides all necessary asdf environment variables for proper version management
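The configuration above can also be generated instead of edited by hand. This is a minimal Python sketch, assuming the same layout as the example (asdf installed under `~/.asdf`, shims first in `PATH`); `build_mcp_config` is a hypothetical helper name, not part of MCP or asdf.

```python
import json
from pathlib import PurePosixPath


def build_mcp_config(home, entry_point, node_version, server_name="your-mcp-server"):
    """Return the MCP configuration as a dict for an asdf-managed Node.js."""
    asdf_dir = PurePosixPath(home) / ".asdf"
    # The shims directory goes first so asdf's `node` wins over any system one.
    path = ":".join([str(asdf_dir / "shims"), "/usr/local/bin", "/usr/bin", "/bin"])
    return {
        "mcpServers": {
            server_name: {
                "command": "node",
                "args": [entry_point],
                "env": {
                    "PATH": path,
                    "ASDF_DIR": str(asdf_dir),
                    "ASDF_DATA_DIR": str(asdf_dir),
                    "ASDF_NODEJS_VERSION": node_version,
                },
            }
        }
    }


if __name__ == "__main__":
    config = build_mcp_config(
        home="/Users/foo",
        entry_point="/Users/foo/projects/mcp-server/build/index.js",
        node_version="22.14.0",
    )
    print(json.dumps(config, indent=4))
```

Running it prints a JSON document with the same shape as the example, so only the three arguments need to change per machine.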

Tips for Implementation

Make sure you have the Node.js version installed in asdf:
```
asdf install nodejs 22.14.0
```
Verify the version is available:
```
asdf list nodejs
```
Adjust the paths in the configuration to match your system’s setup:
- Replace /Users/foo with your home directory
- Update the application path to point to your Node.js application
If you’re using a different Node.js version, update the ASDF_NODEJS_VERSION accordingly

Troubleshooting

If you encounter issues:

Verify that the Node.js version specified in ASDF_NODEJS_VERSION is installed.
Check that all paths in the configuration are correct.
Ensure the asdf shims directory is properly set up.
Try running the Node.js application directly from the command line to verify it works.
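Most of these checks can be automated. The sketch below is one possible way to do it in Python, not an official tool: `find_config_problems` is a hypothetical helper, and it assumes asdf keeps installed versions under `<ASDF_DATA_DIR>/installs/nodejs/<version>`.

```python
import os
import posixpath


def find_config_problems(server, exists=os.path.exists):
    """Return a list of human-readable problems found in one MCP server entry."""
    problems = []
    env = server.get("env", {})

    # Every absolute path passed to node should exist on disk.
    for arg in server.get("args", []):
        if arg.startswith("/") and not exists(arg):
            problems.append(f"missing entry point: {arg}")

    # The asdf shims directory should exist and come first in PATH.
    path_entries = env.get("PATH", "").split(":")
    shims = posixpath.join(env.get("ASDF_DIR", ""), "shims")
    if path_entries[0] != shims:
        problems.append(f"PATH does not start with {shims}")
    if not exists(shims):
        problems.append(f"missing shims directory: {shims}")

    # Assumed asdf layout: <data dir>/installs/<plugin>/<version>.
    version = env.get("ASDF_NODEJS_VERSION", "")
    install = posixpath.join(env.get("ASDF_DATA_DIR", ""), "installs", "nodejs", version)
    if not exists(install):
        problems.append(f"Node.js {version} does not look installed: {install}")

    return problems
```

Pass it one entry from `mcpServers`; an empty list means none of the checked paths is obviously wrong. The `exists` parameter is only there so the function can be tried without touching the real filesystem.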

This configuration provides a robust way to run Node.js applications through MCP while maintaining version consistency through asdf. It’s particularly useful in development environments where you need to switch between different Node.js versions for different projects.

For more information about MCP, check out the official documentation. For asdf, visit the asdf website or their GitHub repository.

Happy coding! 🚀

http://juandebravo.com/2025/04/12/configure-mcp-server-with-asdf/

Get Pull Request metrics with chat-gpt

Mar 27, 2023 Updated Mar 27, 2023


I woke up early this morning and took a few minutes to read this blog post about engineering delivery metrics while having my first coffee :coffee:.

It provides good insights about how to measure the efficiency of your process from a Pull Request being opened until the code changes are shipped to the production environment.

Without going into low-level details, these are the main things that should happen during that process:

Delivery pipeline

The concepts described in that blog post are something that has been running as a background thread in my mind over the last few months, but I am always struggling to invest some time in it.

Until today! I paired with ChatGPT and I hit two targets with one shot.

I was curious about the first part of the diagram above: how long does it take for a Pull Request to get merged in my current project?

I asked:

First question

And GPT replied:

First answer

This was a good way to start our relationship for this matter :blush: It helped me to avoid:

checking which Python library I should use.
checking the GitHub / Python library API to understand how to obtain each pull request and its status.

I realized my question was misleading, as it interpreted that I was interested in a specific Pull Request:

Second question

And GPT replied:

Second answer

I was so impressed that I didn’t even consider testing the script, I wanted more!

Third question

For the sake of readability, I’ve put its answer in an external gist.

And its explanation:

Third answer

It didn’t take into account that I was curious about the last six months of activity, and it decided to query Pull Requests from a specific date (yesterday).

So I asked again:

Fourth question

And its answer in an external gist.

It described again the solution, but to be honest I didn’t read it.

At this point, I’m not sure if there’s been a big gain in productivity. Perhaps the major advantage is that I didn’t need to invest some time reviewing a couple of third-party dependencies, but at the same time, I feel I’m trusting blindly what I’m being told.

Unfortunately, things became a bit tedious:

Fifth question

It apologized for the confusion (so cute :cat:) and pointed to the mistake very fast (incredibly fast!)

Fifth answer

This got me very confused and intrigued. I could understand that ChatGPT knew about since and until attributes in the get_pulls() method if they had existed in previous versions of the PyGithub library and got deprecated for whatever reason in the last version of the library. But that was not the case, they have never existed. So I have no idea why ChatGPT decided to use since. My two theories are that it exists in other libraries and learned from that, or it extrapolated other arguments of the REST API and adapted its knowledge about the python wrapper. Quite fascinating IMO even though it was wrong this time.

This is the script associated with the response above. Here GPT made a rookie mistake. We are interested in the last 6 months of data, so as soon as one Pull Request is older than that, we can stop iterating over previous data (pull requests are ordered by time, descending).
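The early stop is easy to show in isolation. This is a minimal Python sketch, not the generated script: it works on plain dictionaries instead of PyGithub objects, and the `created_at` / `merged_at` keys and the `hours_to_merge` name are assumptions made for the example.

```python
from datetime import datetime, timedelta


def hours_to_merge(pulls, now, window_days=183):
    """Return hours-to-merge for merged PRs created within the window.

    `pulls` must be sorted by creation time, newest first.
    """
    cutoff = now - timedelta(days=window_days)
    hours = []
    for pr in pulls:
        if pr["created_at"] < cutoff:
            # Everything after this point is older still, so stop early.
            break
        if pr["merged_at"] is None:
            continue  # closed without merging, or still open
        delta = pr["merged_at"] - pr["created_at"]
        hours.append(delta.total_seconds() / 3600)
    return hours
```

The `break` is what saves the extra API pages: with a newest-first listing, the first Pull Request outside the window guarantees that all remaining ones are outside it too.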

I got a couple of additional errors:

Sixth question

This script is interesting because it forgot to add the last line of the script:

plt.show()

This reminded me not to blindly trust the machine for now. So I read the script for the first time :laughing: and realized we were retrieving data from the master branch. I can understand that because it used to be the default branch name some years ago, but that’s not the case anymore :punch:.

I suggested using main as the default branch instead:

Seventh question

It was very polite, I was liking it a lot:

Seventh answer

Going back to my previous problem, it was still happening!

Eighth question

And its answer and the updated script, this time with the last line included!

Eighth answer

Here we got an issue that should be easy to spot with the usual linter in your IDE:

Ninth question

And it acknowledged the mistake and provided an updated script.

I was getting a very ugly chart with no data, so I thought maybe we should plot “hours to merge” instead of “days to merge”:

Tenth question

It explained briefly the required changes:

Tenth answer

This is the generated script

At this point, I think both of us needed a break :laughing:.

The chart was still looking pretty bad, so I did what I usually (try to) do in my day-to-day work: read the code, understand what it’s doing, add some helpful debug logs, and find the root cause of what’s going on.

Eleventh question

This was still very interesting but was not increasing my productivity anymore. This is its answer:

Eleventh answer

I was shocked :laughing: It indeed understood the issue (we were using datetime instead of date to create the buckets) but suddenly it introduced numpy (np.) in our script.
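The datetime-versus-date problem can be reproduced without any plotting. A small Python sketch with made-up timestamps (the `merges_per_day` helper is hypothetical, not taken from the generated script):

```python
from collections import Counter
from datetime import datetime


def merges_per_day(merged_at):
    """Count merged PRs per calendar day."""
    # .date() drops the time part, so same-day merges share one bucket.
    return Counter(ts.date() for ts in merged_at)


merged_at = [
    datetime(2023, 3, 20, 9, 15),
    datetime(2023, 3, 20, 16, 40),
    datetime(2023, 3, 21, 11, 5),
]

by_datetime = Counter(merged_at)     # one bucket per distinct timestamp
by_date = merges_per_day(merged_at)  # one bucket per calendar day
```

Bucketing on full timestamps puts almost every Pull Request in a bucket of its own, which is why the chart looked empty; bucketing on dates groups them as intended.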

Twelfth question

This is its answer and the script using numpy.

Twelfth answer

I tested it and it worked :tada:. In less than one hour we had a script to grab the Pull Requests from a specific repo and plot a histogram with the number of hours it took to merge each Pull Request.

I think the job is still not ended, but it was a good start and I needed a break, so I asked it something different to have some social time together:

Last question

I left home at 11:00 am and the cyclist arrived at 12:30 pm. I was lucky I didn’t trust it blindly.

Bicycle

http://juandebravo.com/2023/03/27/get-pull-requests-metrics-with-chat-gpt/

How to mount an external volume with write permissions

Mar 26, 2023 Updated Mar 26, 2023


I am always struggling to figure out how to fix “read-only permissions” when plugging an external USB drive into my Mac. If you are in the same boat, this has worked for me:

Get the list of mounted volumes:

diskutil list

You will get a list of volumes, both internal (your hard disk(s)) and external (your USB).

Identify the external one with read-only permissions (e.g. /dev/disk4) and unmount it:

sudo diskutil unmountDisk /dev/disk4

In my case it’s always /dev/disk4 :man_shrugging:

Create a folder where the USB will be mounted:

sudo mkdir /Volumes/EXTERNAL_USB

Mount the USB again with write permissions:

sudo mount -w -t exfat /dev/disk4s1 /Volumes/EXTERNAL_USB

where:

-w stands for write permissions.
-t exfat is the external type. It should match the format you used when formatting your USB (e.g. exfat, nfs, msdos, etc).
/dev/disk4s1 is the disk identifier + partition to mount.
/Volumes/EXTERNAL_USB is the folder you created in step 3.

The tricky thing to remember is the external type you should use. Valid options can be found by running:

ls /sbin/mount_* | cut -c13-
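The `cut -c13-` part simply drops the 12-character `/sbin/mount_` prefix from each helper binary. The same listing in Python, as a rough sketch (the `filesystem_types` name is made up for the example):

```python
import glob

PREFIX = "/sbin/mount_"


def filesystem_types(paths):
    """Strip the /sbin/mount_ prefix from each helper path, like `cut -c13-`."""
    return sorted(p[len(PREFIX):] for p in paths if p.startswith(PREFIX))


if __name__ == "__main__":
    # On macOS this prints the values accepted by `mount -t`.
    print(filesystem_types(glob.glob(PREFIX + "*")))
```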

And that’s all. Hope it saves time for you, and it will save time for me from now on :laughing:.

http://juandebravo.com/2023/03/26/mount-external-volume-write-permissions/

TID-X registration flow with Arengu

Mar 15, 2020 Updated Mar 15, 2020


Last week I was working on the TID-X registration form; TID-X is a tech event I co-organize with my friends Gustavo and Alonso.

Given that TID-X is a free event, registration is a very simple process: we need to know the approximate number of attendees, so we can make sure that the venue capacity is not exceeded.

Registration flow

Despite its simplicity, the registration needs to include the following:

A list of all attendees including their ID details. All that is required by the security team at the hosting venue. So, we need to make sure that we collect all data without disclosing private information.
A list of attendees for the conference website. TID-X is, mainly, a social event focused on socializing with friends and colleagues but also meeting new people, so getting to know who will be attending in advance is not only a cool thing but also a useful one, if you want to make the most of your time there.

2020 will be the third TID-X edition, and in the previous ones, we (the organizers) tackled these requirements using two separate flows:

private list for security: attendee should fill in her data in a Google Form.
public list of attendees: attendee should send a Pull Request to TID-X repository including her name and personal URL.

Having two decoupled steps is the last thing you want for something like a registration flow :blush:. Also, the Pull Request step created some friction for some attendees.

Welcome Arengu!

I was looking for options to merge these two flows into a single step, to simplify the usability and streamline the signing up for our users. TID-X web page is hosted in Github using Github Pages, and keeping the public list of attendees as part of the source code was definitely something I wanted to maintain (we’re serverless!).

And then, I found Arengu!

Arengu is a very interesting product that raised €500.000 (https://tech.eu/brief/arengu-pre-seed/) some weeks ago. Its goal is to help companies build their sign-up flows with ease. I’d define it as the intersection of Auth0 and Typeform.

Arengu has two main entities:

form: usual functionalities you may expect if you’ve used Google Forms or TypeForm.
flows: build business logic on top of a form.

Going back to our TID-X needs, the list of attendees for the security team was very easy to implement. With Arengu you can view and export the form submissions.

Arengu form view submissions

As for the public list of attendees, it meant updating TID-X source code by adding the new attendee to the attendees list.

That’s a very good fit for an Arengu flow: once the user has filled in the form, if she has accepted to be included in the attendees public list, a new pull request to the source code repository with the relevant user data would be generated:

Arengu flow to open a Pull Request

Setting up a flow (business logic) was very straightforward. The HTTP Request step triggers a Github Action, which will eventually create a Pull Request for adding the attendee name to the attendees list.
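To give an idea of what such an HTTP request can look like, here is a Python sketch that builds (without sending) a call to GitHub's `repository_dispatch` REST endpoint. This is an assumption about the mechanism, not the actual TID-X setup: the post does not say which trigger was used, and the `new-attendee` event name, the payload fields and the `build_dispatch_request` helper are all hypothetical.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, token, name, url):
    """Build (without sending) a request that fires a repository_dispatch event."""
    body = {
        "event_type": "new-attendee",  # hypothetical event name
        "client_payload": {"name": name, "url": url},  # hypothetical fields
    }
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/dispatches",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

A workflow listening for that event type would then read the attendee data from the event payload and open the Pull Request.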

Conclusion

Arengu allowed us to merge the two registration flows we had without writing any code and without deploying any logic. Very cool! 🙌🏻🚀

http://juandebravo.com/2020/03/15/tid-x-registration-flow-with-arengu/

I work remotely

Feb 22, 2020 Updated Feb 22, 2020


I have been working in the tech domain over the last twenty years (OMG!). During that period of time, I’ve transitioned from the regular office environment to be fully remote. This is my personal experience so far, lessons learnt and things I’m still struggling with.

Disclaimer

I’m working at Jamm, a video-first collaboration experience for modern teams. Our goal is to enable not only fast, friction-free communication but also meaningful collaboration with a bunch of people regardless of location or timezone. You can request access to our private beta here.

Jamm

Note: This is not a detailed explanation of the benefits of remote working but my personal experience. If you’re interested in going deeper on the subject, I recommend the book REMOTE, office not required, by @jasonfried and @dhh.

Avoid commuting to work

If you live in Barcelona and work for a company, its offices are usually in the city center, so getting to work will not take you (or me!) more than 40 minutes; that used to be my case (40 minutes max one way) before working from home. Over the years I used the usual means of transport:

metro: a bit crowded in the mornings, fine in the afternoons. Very uncomfortable in summer time, when the humidity in Barcelona is at high peak. Good option to carve out some time to read.
public bicycle system: it works fine, even though it’s fairly common that you need to visit a couple of stations for picking up a bicycle or finding a free slot to leave it back. Things are getting better over time though.
my own bicycle: my favorite option. I had a 25 minutes ride to the office, good enough for doing some physical exercise without sweating. The city is well equipped with a network of bicycle lanes map here.
scooter: fastest option, but that’s the only advantage compared with the others, and it’s for sure the most dangerous one.

In the best case (scooter), it would take me around 50 minutes (round trip) every day. Now that I’m working from home, I get that time back for free. 5 days a week for 11 months per year means roughly 183 hours per year, equivalent to around 22 working days (a month!), 5000 km on my road bicycle or around 2000 km running. Every single year.
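For the curious, the 183 hours can be reproduced with a few lines of Python, assuming a month is rounded down to four weeks and a working day is eight hours (both assumptions are mine, chosen because they match the figure):

```python
MINUTES_PER_DAY = 50   # round trip by scooter, the fastest option
DAYS_PER_WEEK = 5
WEEKS_PER_MONTH = 4    # assumption: a month rounded down to four weeks
MONTHS_PER_YEAR = 11

minutes = MINUTES_PER_DAY * DAYS_PER_WEEK * WEEKS_PER_MONTH * MONTHS_PER_YEAR
hours = minutes / 60
working_days = hours / 8  # assumption: eight-hour working day

print(f"{hours:.0f} hours, between 22 and 23 working days ({working_days:.1f})")
```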

I’ve never been a fan of commuting so I’m taking this as a great improvement in my daily routine.

Get your working space right

Over the last few years I’ve tried different options at home to find the right environment to work. We moved into our current apartment a couple of years ago; however, it was only this winter that I felt I had found not only the room but also the furniture, lighting and hardware that work for me. I tried several options before, but they were either too noisy, bright, dark or distracting.

Previously, since working from home was very rare, having a dedicated space was not so key. Now, with no office to go to, it is important to find a space where I can be productive and focused and, just as important, where I can shut the door and separate my personal life from my professional one.

Hardware

Last year I left my MacBook Pro 2017 as a secondary device and acquired a PC. Not that I was planning to switch from macOS to Windows/Linux, but my perceived performance of that laptop was extremely low and I needed a change. I started thinking about setting up a Hackintosh, and after getting some suggestions from Arturo about hardware and some help from Iván to set it up, I got it working. Find below the hardware that I used; overall I’m very happy with the outcome. Performance is much better and I’ve never got infuriated due to my computer underperforming (now I can only blame myself if I’m getting stuck!):

CPU: Intel Core i7-8700K 3.7GHz BOX.
Motherboard: Gigabyte Z390 Aorus Pro.
Memory: Corsair Vengeance LPX DDR4 3200 PC4-25600 32GB.
SSD: Samsung 860 EVO Basic SSD 500GB SATA3.
CPU Cooler:
Power supply: Corsair RM750X V2 750W 80 Plus Gold Modular.
Tower case: NZXT H500i.
Screen: LG 34UC99-W.
Webcam: Logitech C930e.

There are plenty of guides to Hackintosh hardware setups; this one is the last I read.

Social time

I used to spend a lot of time hanging out with my coworkers when I was working for Telefónica. The company itself (Telefónica R&D) had around 1000 employees back then, and over the ten years I worked there I had the chance to meet a lot of great people. In my last role in the company I was leading a 30-member team in an organization of around 150, which meant a lot of interactions, both professional and personal ones. When I left the company almost three years ago, I thought the social part was going to be one of the things I’d miss the most, and looking back that’s been the case. It was very common for me to hang out during a coffee break, at lunch time or after work with a beer, and I do miss that! Nevertheless I’ve done several activities I would not have done if I had continued working there, like joining a triathlon club (even though I don’t swim, that’s an opportunity to hang out with runners and riders :-)), attending a Japanese food course or taking some painting classes (not very impressive results, as you can see :blush:).

Not very impressing results!

Family time

It’s not something that I knew would happen when I started working remotely, but my daughter was born a couple of months ago, one of the best things that has ever happened to me. After some weeks of parental leave, I’ve got back to work. Knowing that I’m 30 seconds away from her really makes me happier.

Avoid interruptions

Working in an office usually means you get tons of interruptions during the day, especially if you’re working in an open space. Working from home does not mean you won’t have any interruptions, but you can probably manage them better and get rid of them with ease.

Flexible working hours, but limit them

Working from home means it’s much simpler to change my working hours, and having that flexibility is great for those days when something requires my attention, or when it’s simply 12 PM, it’s sunny and I feel like going for a run.

I’ve been using several daily schedules, and the one that currently works better for me is:

start working early in the morning (7 AM): that’s a requirement because our company is not only fully remote but also fully distributed. We have people in San Francisco, Sydney, Perth and Barcelona. While most of our work is asynchronous, having a daily standup helps us to sync up and hang out a bit.
take a long break at noon: something like 3-4 hours, so I can spend some time with the family, do my almost daily run routine while training for the next marathon and even take a short power nap.
work another 4-5 hours in the afternoon: key to overlap with PST timezone and catch up with the team.

Well, that is the theory; in reality I end up working longer hours, although I am working on improving the routine :blush:.

If you want to take away a single piece of advice: if you are working from home, try to identify burnout symptoms and take action.

Technology / Tools

These are the tools we’re using for our day-to-day work:

Jamm: Jamm is a video-first collaboration experience for modern teams. Remember the note at the beginning of this post, that’s the product we’re building.
Slack: for internal communication we don’t use email, hence we’re using Slack for any text-based communication. We also use Slack as an information hub (integrations with our cloud provider, Typeform, source code, ticketing tool, etc).
Notion: internal documentation.

Communication and expectations

Proactive communication is a key skill if you work in a physical office with your team, but it’s even more relevant when working remotely. To keep the team up to speed, and mitigate the effect of working in different timezones, before calling it a day I share my notes with the rest of the team on how the day went, issues I had or just the last joke I’ve heard around…

“Under promise and over deliver” is a well-known quote; if you’re working remotely, I’d reformulate it as “over and over communicate”.

Work from anywhere

One of the key benefits of remote work is that I can work from anywhere, given that all I need is my computer and a good Internet connection. Why should I limit myself to my home office?

So what’s the deal?

According to the State of Remote Work analysis by Buffer and AngelList, 98% of those surveyed would like to work remotely, at least some of the time, for the rest of their career.

State of remote work report 2020

And I don’t blame them! While working remotely may have some negative consequences like not being able to unplug or feeling disconnected from the team, you can mitigate them by building a company culture around remote work and by drawing on your own experience as you gain it.

In my experience, being able to work on a flexible schedule and from anywhere are the major advantages, something I’d definitely try to keep at least for some years :hammer:

http://juandebravo.com/2020/02/22/I-work-remotely/

Debug a Node.js app transpiled with Babel and running in Docker from VSCode

May 20, 2019 Updated May 20, 2019


I’ve been playing lately with the technologies pointed out in the post title, and it took me a considerable amount of time to be able to debug it with ease. I went through several pages tackling these technologies together, but none of them was specific enough for me to get the setup working; hence I’ve decided to describe the steps I took, in case this is helpful to anyone fighting against technology.

Throughout this post I’ll describe the different pieces and introduce some snippets of code / configuration, but in case you prefer to get the whole picture you can clone the sample project that describes the setup.

Description

I want to debug a Node.JS application that is:

written using the latest version of JavaScript, specifically JS modules, which are part of the ES6 (ECMAScript 2015) standard: I prefer using ES modules (import and export statements) instead of the CommonJS require spec, so the code is written using a similar structure to web code.
transpiled using Babel: current versions of Node do not support natively ES6 modules (only through experimental flags), hence it’s required to transpile the source code to code that can be executed in Node (10.15) runtime.
executed as a Docker container: my usual preference for running applications.

My current IDE of choice is VS Code, so the expectation is to be able to define a break point in the IDE, connect the debugger to a running container and intercept any request going through the break point.

Several pieces moving around

Technical pieces

Babel

Babel is an impressive project that helps you run the latest functionalities of JavaScript in any runtime. It makes it happen by compiling down features unsupported by your JS runtime to a supported version.

When using Babel, you need to compile the source code before executing your application. In the sample project there are a couple of npm scripts to launch the application (or just run make run in the root folder):

build: executes babel to compile the source code.
serve: launch the application using the compiled code generated by Babel (dist folder).

Babel functionality is based on composition, so the developer can choose which functionalities are required for the project and install the specific NPM packages (instead of integrating a huge framework). In the sample project this is the bare minimum list of Babel dependencies:

@babel/core: basic functionality, which depends on the project configuration defined in the .babelrc file.
@babel/cli: used to compile from command line instead of loading a library.
@babel/preset-env: defines automatically the required plugins/polyfills based on the runtime defined.

Nevertheless, during development you want your changes to be refreshed as soon as possible, either for running unit tests or for validating the changes manually. Here is where the babel-node package appears: it works similarly to the Node.js command line, but has a pre-step that executes the Babel compilation. This greatly simplifies the hot-reloading process.

Sourcemaps

The JavaScript code to be executed in the Docker container would differ from the source code itself due to the compilation phase.

However, as a developer you want to define a breakpoint while debugging in the original source code, as it’s loaded in your IDE (VS Code in this case).

This is where a source map comes into the scene, as it maps the compiled code to the original one, creating a way to reconstruct the source code.

Babel can be configured to generate source maps either from the CLI (--source-maps flag) or in the configuration file (.babelrc). Keep in mind though that its support in the config file is limited.

Docker

Upon launching the VS Code debugger, it will open a connection to the machine where the application is running. By default the port where the Node application listens is 9229, hence it’s important to map this port from the container to localhost so VS Code can connect to it.
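Before blaming VS Code, it helps to check that the port is reachable from the host at all. A small Python sketch (the `port_is_open` helper is hypothetical); note that a successful connection only shows that the port is published, not that the Node inspector is actually listening inside the container.

```python
import socket


def port_is_open(host="localhost", port=9229, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("debug port reachable" if port_is_open() else "debug port closed")
```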

Nodemon

Nodemon is a utility that monitors for any changes in the predefined paths (the source code path) and automatically restarts the application. It’s installed as a development dependency via NPM.

Nodemon will run inside the Docker container, so for Nodemon to detect changes in the source code it’s important to mount a volume in the container to map the source code from the host to the container.

VSCode

VSCode provides several ways to debug a program. In this case, we’re interested in attaching to a process that is running in a remote host (a Docker container) through a local binding (port 9229, as described before). Upon adding a debug configuration, the file .vscode/launch.json will be updated with the new entry. It’s important to configure the attributes that define the local and remote source code root path, as well as the source maps location.

Properly defining the VS Code launch configuration was one of the hardest points; if you’re having issues check my configuration in the sample project.
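As a rough illustration of the shape of such an entry, the Python sketch below prints a minimal attach configuration. It is not the one from the sample project: the attribute names are standard VS Code Node.js debugger settings, but the configuration name and the `/usr/src/app` remote root are assumptions.

```python
import json


def attach_configuration(port=9229, remote_root="/usr/src/app"):
    """Return a VS Code 'attach' debug configuration as a dict."""
    return {
        "type": "node",
        "request": "attach",
        "name": "Attach to Docker",         # hypothetical label
        "address": "localhost",
        "port": port,                       # inspector port mapped from the container
        "localRoot": "${workspaceFolder}",  # source code on the host
        "remoteRoot": remote_root,          # assumed source path inside the container
        "sourceMaps": True,
        "restart": True,                    # re-attach after nodemon restarts the app
    }


launch = {"version": "0.2.0", "configurations": [attach_configuration()]}
print(json.dumps(launch, indent=2))
```

The `localRoot` / `remoteRoot` pair is what lets a breakpoint set on a host file match the file the container is running.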

Conclusion

The sample project includes any configuration required for the setup.

These are the big parts:

babel: several modules installed via NPM (package.json). Its configuration relies on the .babelrc file.
babel-node: it transpiles and executes the code. Any CLI argument provided to babel-node is forwarded to Node, so it’s important to include the flag --inspect to activate the inspector.
nodemon: used in NPM script start for hot-reloading the application.
docker: important to expose the debugging port (9229) and mount the source code folder from the host.
Makefile: execute the target debug for running the application listening in the debug port and ready for hot-reloading.
VS Code: Debug configuration in .vscode/launch.json file.

Once you have defined a breakpoint in your code in VS Code:

launch the Docker container executing make debug.
launch VS Code debugger.
You should see a Debugger attached log in the Docker stdout logs:

Debugger attached

Launch a request that goes through your breakpoint.
VS Code stops in your breakpoint.

Enjoy :-)

http://juandebravo.com/2019/05/20/debug-node-babel-docker/

How to skip a subfolder while mounting a volume in Docker

Apr 25, 2019 Updated Apr 25, 2019


When working in a container-based environment, it’s very common to build Docker images that are agnostic to the environment. That way, they can be deployed in any environment, from local development to production, which has the great advantage that a Docker image certified internally in a testing/staging environment does not need to change before being deployed to production.

However, this may create an issue while developing, where you want to test your changes as soon as possible, hence it’s convenient to skip creating a new Docker image every time a change in the source code is detected.

A common pattern is to run locally a Docker container mounting the folder where your source code is located into the folder where it’s included while generating the Docker image (assuming there’s no compilation / packaging / etc for going straight into the point :) ).

For example, let’s assume a Node.js application that keeps the source code in the root folder of the repository in the host machine and that copies it into the /usr/src/app folder of the Docker image. Mounting the source code folder can be accomplished with the following command:

docker run -v "$(pwd)":/usr/src/app my-app:latest

While this is very straightforward, it may create a big issue, as you don’t want, at all, to mount your node_modules subfolder into the Docker container. Some dependencies might require native code/compilation, and if your host machine does not match the one used in the Docker image, you’d get into trouble.

To prevent this, a special volume option pointing to the node_modules subfolder can be included, and this folder won’t be overwritten by the host information:

docker run \
  -v "$(pwd)":/usr/src/app \
  -v /usr/src/app/node_modules \
  my-app:latest
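If you launch the container from a script, the same pattern generalizes to any number of subfolders. A minimal Python sketch (the `docker_run_args` helper is hypothetical) that only builds the argument list:

```python
import os


def docker_run_args(image, source_dir, workdir="/usr/src/app", skip=("node_modules",)):
    """Build the `docker run` argument list with masked subfolders."""
    args = ["docker", "run", "-v", f"{source_dir}:{workdir}"]
    for sub in skip:
        # A volume with only a container path is anonymous: Docker mounts it
        # over the bind mount, so the host copy of the subfolder is hidden.
        args += ["-v", f"{workdir}/{sub}"]
    return args + [image]


if __name__ == "__main__":
    print(" ".join(docker_run_args("my-app:latest", os.getcwd())))
```

The resulting list can be passed to `subprocess.run`; the anonymous volume is initialized from the folder baked into the image, so the dependencies installed at build time stay in place.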

http://juandebravo.com/2019/04/25/docker-mount-volume-skip-subfolder/

Continuous Delivery to Google K8s Engine using Travis

Mar 1, 2019 Updated Mar 1, 2019


When you start a new project, one of the key points to tackle and nail down is how to automate the recurrent tasks you will face frequently, mainly around continuous integration and deployment. Small investments in time around automation at the beginning (or at any time) will pay off very quickly.

In a side project I’ve been working on over the last weeks, we chose Travis CI for Continuous Integration and Google Kubernetes Engine for building our production environment (very optimistic to call it production at our stage).

Travis official documentation describes in detail how to set up your project for running tests, either unit or acceptance, upon opening a Pull Request in GitHub or committing a change to a specific branch.

On the other hand, Google Cloud documentation includes a detailed example about how to implement continuous delivery to App Engine using Travis CI.

However, I could not find any official documentation, blog post or even StackOverflow thread describing how to configure Travis for deploying a change to Google Kubernetes Engine when the integration phase ends with success. So this is what this post is about.

The goal

The picture below describes what I was aiming for (remember that I’m putting aside the Continuous Integration workflow):

Continuous Delivery to Google Kubernetes Engine

I’ve created this GitHub repo as an example. It may be useful if you are considering implementing this workflow; in that case, remember to update the configuration in the Makefile with your Google Cloud data. If you’re already familiar with Travis and Google Cloud, you may skip the rest of the post and go directly to that repository.

Setting up Travis

Travis setup is configured in the .travis.yml file, which should be part of the repository that keeps your source code. In this section we’ll go through the main aspects to consider for our scenario.

Even though we want to deploy the new version whenever there’s a new change in the master branch, we should do that only if the unit/integration testing phase ends successfully. Following the Travis job lifecycle documentation, the configuration to achieve that is the following:

branches -> only: set this configuration to master:

branches:
  only:
  - master

after_success: define the script that handles the deployment to GKE. We use Travis automatic env variables to ensure we don’t deploy a new version when the build is associated with a Pull Request.

# Deploy web version to Google Kubernetes Engine
after_success:
- if [ "$TRAVIS_PULL_REQUEST" == "false" ]; then ./deploy-web.sh; fi

Another relevant point is to enable Docker as a service, as we need to build a Docker image and push it to a Docker registry.

services:
- docker

Here you can find the .travis.yml from the example.

Setting up Google Cloud

This guide assumes you already have a Google Cloud project running a Kubernetes Engine cluster.

Storage

In order to deploy a new workload to Kubernetes, the relevant Docker images must be pushed to a Docker registry, which Kubernetes will use to pull them.

Even though we could use Docker Hub for that, we will use Google Container Registry. This means we’ll need to grant the script running in Travis permission to access Google Storage, as this is the infrastructure that backs Container Registry.

Create Service account and key associated to it

In order to deploy a new version from Travis to Google Cloud, we need to create a service account that enables the Cloud SDK (installed in our Travis environment, as we’ll see) to authenticate with our Google Cloud project.

In the Google Cloud Project Console, open the IAM & admin page.
Go to Service accounts.
Click Create Service Account.
Enter a Service account name, such as continuous-delivery-from-travis.
Include a description for the account.
Click Create.
In the Service account permissions select box, select the following roles:
- Kubernetes Engine -> Kubernetes Engine Developer.
- Storage -> Storage Admin.
Click Create.
Click Create Key, choose JSON as Key type and press Create. A JSON file containing the generated key is downloaded to your computer.
Click Done.

Define your strategy for making available the Service Account Key in Travis

Key management is a top-priority task from a security point of view for any project, and the good news is that any cloud provider gives you a solid solution for secrets management. I’m not tackling this point in this article, but as this article is related to Travis and Google Cloud, those two links are the key entry points for the subject:

Deployment script

Now that we have Travis and Google Cloud configured, the last point to address is the script that will deploy the latest version of our software to our Kubernetes cluster. The script should:

Use gcloud command-line tool to connect to Google Cloud.
Ensure kubectl is available.
Authenticate against Google Cloud (using the key we generated and encrypted in previous steps).
Configure the project, Kubernetes cluster and Docker registry.
Build Docker images.
Push Docker images to Docker registry.
Deploy to Kubernetes cluster.

This bash script covers the steps above for the example repo.
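As a rough illustration only, the steps above map to a command sequence like the one sketched below, written in Python rather than bash. This is not the script from the example repo: the helper names (deployment_commands, deploy) and every argument value are hypothetical placeholders, and the sketch assumes the container inside the Kubernetes Deployment has the same name as the Deployment itself.

```python
import subprocess


def deployment_commands(project, cluster, zone, image, deployment, key_file, version):
    """Return, in order, the CLI calls for one deployment to GKE.

    Every argument is a placeholder supplied by the caller; none of the
    values come from the example repository.
    """
    tag = "gcr.io/{0}/{1}:{2}".format(project, image, version)
    return [
        # Ensure kubectl is available through the Cloud SDK
        ["gcloud", "--quiet", "components", "install", "kubectl"],
        # Authenticate with the service account key
        ["gcloud", "auth", "activate-service-account", "--key-file", key_file],
        # Configure the project, the cluster credentials and the Docker registry
        ["gcloud", "config", "set", "project", project],
        ["gcloud", "container", "clusters", "get-credentials", cluster, "--zone", zone],
        ["gcloud", "--quiet", "auth", "configure-docker"],
        # Build and push the Docker image
        ["docker", "build", "-t", tag, "."],
        ["docker", "push", tag],
        # Roll out the new image (assumes container name == deployment name)
        ["kubectl", "set", "image", "deployment/" + deployment,
         "{0}={1}".format(deployment, tag)],
    ]


def deploy(**kwargs):
    """Run every command, stopping at the first one that fails."""
    for command in deployment_commands(**kwargs):
        subprocess.run(command, check=True)
```

Keeping the command list separate from the code that executes it makes the sequence easy to review or log before anything touches the cluster. In Travis, something like the TRAVIS_COMMIT variable would be a natural value for version.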

Conclusion

Using Travis for Continuous Delivery to Google Kubernetes Engine is quite straightforward… as long as you’re familiar with both tools :rocket:.

http://juandebravo.com/2019/03/01/travis-google-kubernetes-engine-deployment/

I have just deleted my Facebook account

Dec 26, 2018 Updated Dec 26, 2018

TL;DR: I’ve just deleted my Facebook account, and you should think about it as well.

I removed my Facebook application from my mobile device about a year ago.

My personal journey started some months or years before, after reading The Happiness Advantage, recommended by my friend Shay Cohen.

Back then, I had a very bad habit regarding how I managed my professional email. Lots of discussions and topics were going on via email (pre-Slack times :)), and I was checking it from my mobile as soon as I had the minimum spare time: waiting for the metro, having breakfast, walking… and, more problematic, right before going to sleep and right after waking up.

Shay explains in this post how difficult breaking a bad habit is. In my case, to break my toxic habit I decided to remove my professional email account from my mobile before going on holidays and, more importantly, not to reconfigure it when I got back to work. That was my personal deal and I succeeded in keeping it. At the beginning I felt bad and saw it as unprofessional behavior. I just needed to realize that nobody gets paid just for managing emails from a specific device, so eventually I felt ok and I never reconfigured it on my mobile.

Fast forward some months, and the same thing was happening with Facebook (I was checking Facebook as soon as I had 30 seconds). As I already knew the solution, I just removed the application.

This was before the first of a series of scandals related to Facebook data usage.

Since then, I’ve accessed Facebook via the web interface only from time to time (maybe once every three months). During these months, the only functionality I’ve kept using is the friends’ birthdays reminder (automatic email sent by Facebook).

The latest news about the social network acted as the trigger I needed to delete my Facebook account. If you’re interested in how Facebook’s online advertising business works and how it uses personal information, I recommend this post by Daniel Coloma.

So the next step for me was deleting my Facebook account. This article by The New York Times describes pretty well how to do it. The #DeleteFacebook movement has been around for some time, and even Brian Acton, WhatsApp cofounder, joined it some months ago.

I downloaded my Facebook personal data in JSON format instead of HTML, so it’s easier to work with it in case I need to.

Some points about the downloaded data:

it’s very well organized by functionality in separated folders.
it provides accurate data about when things happened in your social network.
some folders contain a file with invalid JSON: its content is You have no data in this section, but without double quotes, hence it’s not a valid JSON string.
it misses relevant information for me, like birthday events.

In case you also want to download your friends’ birthdays information, just follow these steps.

Bye bye Facebook. You were great in the past and hopefully you will seek and find another way to do business, so you can be trustworthy again.

http://juandebravo.com/2018/12/26/I-have-just-deleted-my-facebook-account/

How to sign a commit in git with a GPG key

Dec 2, 2018 Updated Dec 2, 2018

In git, whenever you want to add a change to a repository, a new commit is created in the repository history.

Besides the actual changes in the repository, every commit includes several metadata fields, some of them pointing to both the author and the committer:

Author: the person who wrote the code.
Committer: the person who added the code to the repository.

    Author:     juandebravo <juandebravo@gmail.com>
    AuthorDate: Wed Nov 21 22:41:04 2018 +0100
    Commit:     juandebravo <juandebravo@gmail.com>
    CommitDate: Wed Nov 21 22:41:04 2018 +0100

99% of the time, both author and committer point to the same person; only in distributed teams working in a flow that requires both git format-patch and git apply may the author and committer be different (e.g. someone provides a new functionality or bug fix, the author, but does not have write permissions in the repository, and therefore another person, the committer, writes the commit on his/her behalf).

Both fields are configurable locally, and there’s no way to ensure that a commit uploaded to a repository hosted in a SaaS product like GitHub or GitLab was authored or committed by the person defined in the Author and Committer fields… unless you use commit signature verification.

Git provides a mechanism to sign your work with a GPG key.

Working on my macOS High Sierra, these are the steps I followed to get my commits signed:

Install git and gnupg

brew install git gnupg gpg-agent pinentry-mac
echo "pinentry-program /usr/local/bin/pinentry-mac" >> ~/.gnupg/gpg-agent.conf

Generate a GPG key

gpg --gen-key

Important: include the email and user name you’d like to use while signing your commits.

Obtain the GPG key id of the key you just generated

You will need this number in the following step.

gpg --list-secret-keys --keyid-format LONG

Configure username, email and GPG in git

git config --global user.name <your-user-name>
git config --global user.email <your-email>
git config --global gpg.program gpg
git config --global commit.gpgsign true
git config --global user.signingkey <your-gpg-key-id>

Restart GPG agent

gpgconf --kill gpg-agent

GitHub

Once you start signing your commits, you can let the world know commits pointing to your username in GitHub were indeed committed by you (or at least by someone with access to your GPG private key!).

First of all, export the public key to ASCII armor format:

gpg --armor --export <your-gpg-key-id>

Second step is uploading the public key using your GitHub key settings page:

Click on New GPG key
Add the GPG public key ASCII representation
Click on Add GPG key

From now on, your commits will be labeled with [Verified] in the GitHub repository history page.

Sign git commit

http://juandebravo.com/2018/12/02/sign-your-git-commits-with-gpg/

Python 3.7: StopIteration could be your next headache

Nov 9, 2018 Updated Nov 9, 2018

I upgraded the python interpreter in one of the components I run in production from 3.5 to 3.7. The main reason was to test the dataclasses functionality. Some time ago I wrote a post about how to build a simple automatic initializer, and it’s now implemented in the standard library, which I believe is a great new feature!

The upgrade was quite smooth, even though I had an issue due to a non-backwards-compatible change in how a generator should flag that it’s exhausted.

Making a long story short: use return instead of StopIteration when you want to stop sending items from your generator.

Next is the long story: the things I learned while investigating the issue and the reason why I didn’t catch it beforehand.

Lesson 1: Read the release notes

This may sound obvious, but…

Remember, a few hours of trial and error can save you several minutes
of looking at the README.

I am devloper (7 Nov 2018)

The Python 3.7 release notes explain very clearly the changes this new version brings to python behavior, one of them being PEP 479. This PEP introduces a non-backwards-compatible change in how a generator behaves.

I was so focused on the new feature I wanted to test, I didn’t go over the release notes.

Lesson 2: The process

When you have a containerized environment, it’s quite straightforward to change/upgrade the execution platform of a specific component. We’ve been using Python 3.5 as our default runtime for a while, but I was eager to upgrade to the newest version and check the new functionalities provided by the language.

To be specific, I read some months ago about dataclasses and it really looked to me like a nice functionality to reduce boilerplate code. Also, PEP 563 - Postponed evaluation of annotations is a great new feature, in case you’re using annotations in your code.

In order to test the python interpreter upgrade, I chose as candidate an offline component whose main goal is accessing the gDrive API. Our library implements a tiny wrapper over the official python client library.

Find below a simplified snippet that, given a gDrive folderId, returns a generator which includes every file under that folder. As filenames are returned in several pages to prevent a very long response, we prefer using a generator instead of a list, so it’s up to the client to iterate as much as it’s needed (and therefore request additional pages under the hood):

def get_files_in_folder(folder_id, next_page_token=None):
    # Request a page
    _files, next_page_token = _get_files_page(folder_id, next_page_token)

    # Return every file included in the results
    for f in _files:
        yield f

    if next_page_token is None:
        # Flag generator is exhausted
        raise StopIteration
    else:
        # Request next page
        yield from get_files_in_folder(folder_id, next_page_token=next_page_token)

def _get_files_page(folder_id, page_token):
    results = get_service().files().list(
        pageSize=25,
        pageToken=page_token,
        fields="files(id,name),incompleteSearch,kind,nextPageToken",
        q="'{0}' in parents".format(folder_id)
    ).execute()

    return results.get('files', []), results.get('nextPageToken')

Right, raising StopIteration instead of simply returning was probably not a good idea. This code works properly in python 3.5 but does not in 3.7, due to the change introduced in the mentioned PEP 479.

Simply replacing:

# Flag generator is exhausted
raise StopIteration

with

# Flag generator is exhausted
return

solved the issue.
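To see the difference in isolation, here is a minimal sketch that runs without any gDrive access. The PAGES dictionary is a made-up in-memory stand-in for the paged API (page token mapped to files and next token), and both function names are hypothetical. On Python 3.7 or later, the variant that raises StopIteration fails with a RuntimeError, while the variant that returns walks through every page.

```python
# Hypothetical stand-in for the paged API: token -> (files, next token)
PAGES = {
    None: (["a.txt", "b.txt"], "page-2"),
    "page-2": (["c.txt"], None),
}


def files_with_return(token=None):
    files, next_token = PAGES[token]
    yield from files
    if next_token is None:
        return  # ends the generator cleanly
    yield from files_with_return(next_token)


def files_with_raise(token=None):
    files, next_token = PAGES[token]
    yield from files
    if next_token is None:
        raise StopIteration  # turned into RuntimeError since Python 3.7
    yield from files_with_raise(next_token)


print(list(files_with_return()))

try:
    list(files_with_raise())
except RuntimeError as error:
    print("RuntimeError:", error)
```

The RuntimeError is raised inside the innermost generator and propagates through every yield from level, which is why the failure surfaces only once the last page is reached.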

Lesson 3: Check silent deprecation warnings

PEP-479 describes very clearly the transition plan in regards to the backwards incompatible change:

Python 3.5: Enable new semantics under future import; silent deprecation warning if StopIteration bubbles out of a generator not under future import.
Python 3.6: Non-silent deprecation warning.
Python 3.7: Enable new semantics everywhere.

As I upgraded from 3.5 straight to 3.7, I didn’t get any deprecation warning. Configuring the PYTHONWARNINGS environment variable is a straightforward way to get this kind of warning.
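The effect of the warning filter can be sketched with the standard warnings module. The example below is only an illustration: legacy_call is a made-up function standing in for any code that triggers a deprecation, and collect_warnings is a hypothetical helper. Setting PYTHONWARNINGS=default (or running python -W default) installs a similar filter for the whole process, showing the first occurrence of each warning.

```python
import warnings


def legacy_call():
    # Hypothetical function standing in for any deprecated code path
    warnings.warn("legacy_call is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def collect_warnings(filter_action):
    """Run legacy_call under the given filter and return the messages seen."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(filter_action)
        legacy_call()
    return [str(warning.message) for warning in caught]


print(collect_warnings("ignore"))  # the warning is silenced
print(collect_warnings("always"))  # the warning is reported
```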

Lesson 4: Bring future default behaviour to your code

I took this opportunity to review what exactly __future__ statements tackle.

Regarding the StopIteration breaking change, Python 3.5 introduced the chance to write your code in the way that is required in 3.7 (check the first point in the transition plan above). It’s really convenient to be ready for your next interpreter/library upgrade. In this specific case, all I needed to do was add the following line at the top of the gdrive API wrapper:

from __future__ import generator_stop

That way, the code running on python 3.5 behaves as on python 3.7, hence raising an error unless raise StopIteration is replaced with return.

http://juandebravo.com/2018/11/09/python-37-stop-iteration/

Kubernetes persistent volumes on top of AWS EFS

Sep 28, 2018 Updated Sep 28, 2018

When deploying an application in a container-based environment, one of the usual challenges is how and where to store application state if your application requires it, which is usually the case.

Today I was tackling one of the recurrent issues while building a web application: a user should be able to upload a file (an image in this case) and retrieve it later. This obviously means storing the file on the server side.

Files inside a container are ephemeral: there’s no guarantee how long a container will be alive, and therefore local state inside the container is not an option (it’s usually not an option anyway if you want to scale horizontally).

In this scenario, the state is represented by a binary file. While there are multiple solutions for this use case, using a shared disk that can be mounted by a set of nodes is what I’m most used to.

But working with kubernetes, “mounting a shared disk in a set of nodes” is not that straightforward.

In this post I’ll go through the relevant kubernetes resources for tackling this problem, and specifically how to implement it on top of AWS using an EFS storage resource.

The basics. A kubernetes Volume

A Volume is a kubernetes storage resource attached to a Pod, and it lives as long as the Pod it’s attached to does.

These are the usual scenarios where I’ve been using a Volume so far:

mount ConfigMap data using a configMap Volume type. This is handy for creating files in the container with the ConfigMap information.
share state between containers that are part of the same Pod using an emptyDir Volume type.
mount an external block storage, like Amazon EBS or Google Persistent Disk, using awsElasticBlockStore and gcePersistentDisk respectively. This is useful if you need to store data whose availability is limited to one container at a time (no need to share state between pods). Note that the external resources must be created, via the cloud provider web console or command line tool, before you can use them in kubernetes.

While the Volume is indeed convenient for the scenarios described above, there’s a big limitation: it can be mounted only in one Pod. Therefore, a Volume is not a good solution for my scenario, where I need binary files to be available in several Pods (to scale the solution horizontally).

The advanced. A kubernetes Persistent Volume

A Persistent Volume is a cluster resource on its own and has its own lifecycle. It represents a storage resource available to any Pod created in the cluster. Not being attached to a specific node/pod is one of the main differences with a Volume.

Similar to how memory and CPU resources can be configured in a Pod specification, a Pod’s storage requirements (Persistent Volume) can be defined using a PersistentVolumeClaim definition. There are two attributes that can be configured: size and access mode (read, write).

Mind the difference between these two concepts: a Persistent Volume is a cluster resource (like nodes, memory, CPU), while a Persistent Volume Claim is a set of requirements about the storage a Pod needs.

The last, but not least, concept is the StorageClass kind, which is used to describe a storage resource (similar to including metadata or defining several profiles). A Pod’s storage requirements can be configured either by defining size and access mode via a PVC, or by defining the needs in more abstract terms, using a StorageClass.

A Persistent Volume can be provisioned dynamically by means of a StorageClass definition (using the parameter provisioner).

Steps to mount an EFS resource in a Pod

Back to my original problem, how can I mount a disk for sharing state (binary files) between Pods?

Running a kubernetes cluster in AWS, it seems like EFS is the natural choice.

These are the steps I went through:

1. Create an EFS resource and make it available to kubernetes nodes using aws cli

An EFS resource can be created executing the following command:

aws efs create-file-system --creation-token efs-for-testing

The response is a JSON payload including a field named FileSystemId, which represents the unique identifier that should be used to manage the EFS volume. Let’s assume the FileSystemId is fs-testing.

EFS creation is an asynchronous process, and before managing it you need to make sure its LifeCycleState is available. The EFS state can be checked as follows:

aws efs describe-file-systems --file-system-id fs-testing
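If you script this check instead of running the command by hand, a polling loop is enough. The sketch below is an illustration under a few assumptions: wait_until_available is a hypothetical helper, and describe is any callable shaped like boto3’s EFS client method describe_file_systems (it takes a FileSystemId keyword and returns a dictionary with a FileSystems list). Passing the callable in keeps the sketch runnable without AWS credentials.

```python
import time


def wait_until_available(describe, file_system_id, attempts=30, delay=2.0):
    """Poll until the file system's LifeCycleState is 'available'.

    Returns True once the state is reached, or False after `attempts` polls.
    """
    for _ in range(attempts):
        response = describe(FileSystemId=file_system_id)
        state = response["FileSystems"][0]["LifeCycleState"]
        if state == "available":
            return True
        time.sleep(delay)  # EFS creation is asynchronous, so wait and retry
    return False
```

With boto3 installed and credentials configured, the call would look like wait_until_available(boto3.client("efs").describe_file_systems, "fs-testing").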

Once the EFS is available, the next step is creating a mount target associated with it. A mount target acts as a virtual firewall, defining a subnet and a security group that is granted permissions to mount the EFS volume.

To create the mount target you need the subnet-id and security-groups associated with your kubernetes cluster nodes. The usual scenario is that every node shares the same security group, while the subnet id differs based on the Availability Zone where the node is located:

aws ec2 describe-instances --filters <your-filters-to-retrieve-k8s-nodes>

For each SubnetId and SecurityGroupId, execute the following command:

aws efs create-mount-target \
--file-system-id fs-testing \
--subnet-id {SubnetId} \
--security-groups {SecurityGroupId}
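Picking the distinct subnets out of the describe-instances response by hand gets tedious with more than a few nodes. As a sketch, assuming the JSON response has already been parsed into a dictionary, the hypothetical helper below returns one entry per subnet together with the security groups seen on its nodes. The sample identifiers are invented for the example.

```python
def mount_target_args(describe_instances_response):
    """Return one (SubnetId, [GroupId, ...]) pair per distinct subnet."""
    targets = {}
    for reservation in describe_instances_response["Reservations"]:
        for instance in reservation["Instances"]:
            groups = targets.setdefault(instance["SubnetId"], set())
            groups.update(group["GroupId"] for group in instance["SecurityGroups"])
    return [(subnet, sorted(groups)) for subnet, groups in sorted(targets.items())]


# Invented sample: three nodes spread over two subnets, sharing one security group
sample = {"Reservations": [{"Instances": [
    {"SubnetId": "subnet-a", "SecurityGroups": [{"GroupId": "sg-nodes"}]},
    {"SubnetId": "subnet-b", "SecurityGroups": [{"GroupId": "sg-nodes"}]},
    {"SubnetId": "subnet-a", "SecurityGroups": [{"GroupId": "sg-nodes"}]},
]}]}

for subnet_id, group_ids in mount_target_args(sample):
    print(subnet_id, group_ids)
```

Each resulting pair corresponds to one create-mount-target call.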

2. Deploy the EFS provisioner

A Kubernetes deployment includes, by default, several Persistent Volume types, like AWSElasticBlockStore, GCEPersistentDisk, AzureFile and NFS, to name a few. Each of them defines a specific provisioner that can be used to create a PV.

Furthermore, the kubernetes incubator external-storage repository holds additional Persistent Volumes that are not part of a Kubernetes default deployment, and here I found the answer to my specific need: the EFS provisioner.

The EFS provisioner is a deployment that runs a container with access to the AWS EFS resource. It acts as an EFS broker, allowing other pods to mount the EFS resource as a PV.

These are the definitions I used for deploying the EFS provisioner, even though you can find very similar definitions in the kubernetes-incubator GitHub repository:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: efs-provisioner
data:
  file.system.id: '<<your-efs-id>>'
  aws.region: '<<your-region-id>>'
  provisioner.name: mycompany.com/aws-efs

---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: efs-provisioner
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: efs-provisioner
    spec:
      containers:
        - name: efs-provisioner
          image: quay.io/external_storage/efs-provisioner:latest
          env:
            - name: FILE_SYSTEM_ID
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: file.system.id
            - name: AWS_REGION
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: aws.region
            - name: PROVISIONER_NAME
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: provisioner.name
          volumeMounts:
            - name: pv-volume
              mountPath: /persistentvolumes
      volumes:
        - name: pv-volume
          nfs:
            server: <<your-efs-id>>.efs.<<your-region-id>>.amazonaws.com
            path: /

kubectl apply -f efs-provisioner.yaml

3. Define the StorageClass kind

StorageClass is used as an intermediate step for connecting a PersistentVolumeClaim with a specific storage resource:

metadata.name field is used to refer to the resource.
provisioner is used to identify the provisioner (EFS provisioner in this case).

Important: A StorageClass definition cannot be updated.

---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: mycompany.com/aws-efs

kubectl apply -f storage-class.yaml

4. Define the PersistentVolumeClaim

The PVC definition connects access mode and size requirements with a specific StorageClass item. In this case, as EFS has unlimited storage, the size requested won’t have any real impact.

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
spec:
  storageClassName: aws-efs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

kubectl apply -f pvc.yaml

As soon as you create the PVC, the EFS provisioner will get notified and will create a PV that matches the requirements. These are the EFS provisioner logs showing the PV creation:

I0928 11:03:45.897983       1 controller.go:987] provision "default/efs" class "aws-efs": started
I0928 11:03:45.900711       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"efs", UID:"2b56b224-c30e-11e8-abf5-023d3cfc37fe", APIVersion:"v1", ResourceVersion:"52345195", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/efs"
I0928 11:03:45.950090       1 controller.go:1087] provision "default/efs" class "aws-efs": volume "pvc-2b56b224-c30e-11e8-abf5-023d3cfc37fe" provisioned
I0928 11:03:45.950116       1 controller.go:1101] provision "default/efs" class "aws-efs": trying to save persistentvvolume "pvc-2b56b224-c30e-11e8-abf5-023d3cfc37fe"
I0928 11:03:45.956467       1 controller.go:1108] provision "default/efs" class "aws-efs": persistentvolume "pvc-2b56b224-c30e-11e8-abf5-023d3cfc37fe" saved
I0928 11:03:45.956498       1 controller.go:1149] provision "default/efs" class "aws-efs": succeeded
I0928 11:03:45.956643       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"efs", UID:"2b56b224-c30e-11e8-abf5-023d3cfc37fe", APIVersion:"v1", ResourceVersion:"52345195", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-2b56b224-c30e-11e8-abf5-023d3cfc37fe

And now you can retrieve both PV and PVC using kubectl:

kubectl get pv
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM         STORAGECLASS   REASON    AGE
persistentvolume/pvc-2b56b224-c30e-11e8-abf5-023d3cfc37fe   1Gi        RWX            Delete           Bound     default/efs       aws-efs              4m

kubectl get pvc
NAME                        STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/efs   Bound     pvc-2b56b224-c30e-11e8-abf5-023d3cfc37fe   1Gi        RWX            aws-efs        4m

5. Create a Deployment with 2 replicas and mount the Volume

Pods get access to the PV storage by defining the claim as a volume in the Pod definition. Claims must exist in the same namespace as the pods using the claim (StorageClass and PersistentVolume are global kinds in the cluster).

The snippet below is a basic Deployment example with 2 pods mounting a volume using a PVC. Each Pod generates a single file in the shared folder and checks that the folder has additional files, which shows that the other Pod has indeed created its file.

---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: test-efs
spec:
  replicas: 2
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: test-efs
    spec:
      restartPolicy: Always
      containers:
      - name: test-pod
        image: gcr.io/google_containers/busybox:1.24
        command:
          - "sh"
        args:
          - '-c'
          - 'touch "${MEDIA_PATH}/${MY_POD_NAME}"; echo "File created, waiting a bit to ensure the other Pod had the time as well"; sleep 5; [[ $(ls "$MEDIA_PATH" | wc -l) -gt 1 ]] && (echo "Both pods generated the file!" && exit 0) || (echo "Unable to create both files in the shared folder" && exit 1)'
        env:
          - name: MY_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: MEDIA_PATH
            value: "/var/media/uploads"
        volumeMounts:
          - name: efs-pvc
            mountPath: "/var/media/uploads"
      volumes:
        - name: efs-pvc
          persistentVolumeClaim:
            claimName: efs

Checking the Pods logs we can see that the scenario is successfully validated:

kubetail --selector app=test-efs
Will tail 2 logs...
test-efs-546d6d7456-2fvgp
test-efs-546d6d7456-2gqx6
[test-efs-546d6d7456-2fvgp] File created, waiting a bit to ensure the other Pod had the time as well
[test-efs-546d6d7456-2gqx6] File created, waiting a bit to ensure the other Pod had the time as well
[test-efs-546d6d7456-2fvgp] Both pods generated the file!
[test-efs-546d6d7456-2gqx6] Both pods generated the file!

Conclusions

While not very obvious, once you internalize the concepts around persistent storage, having a shared folder mounted in several Pods in a kubernetes cluster running in AWS is quite straightforward.

http://juandebravo.com/2018/09/28/aws-efs-kubernetes/

Impact of Amazon Prime in my purchase habits

Sep 9, 2018 Updated Sep 9, 2018

Summertime is always a great opportunity for me to read more frequently than usual. This August I’ve read two interesting books related to data analysis in python:

In the last chapter of Personal Finance with Python: Using pandas, Requests, and Recurrent, the author discusses time series forecasting and how he used Prophet to estimate his future Amazon expenses.

The book referenced a CSV file including the author’s Amazon purchases since 2012, hence I guessed there might be a simple way to download a CSV report from Amazon including every purchase the logged-in user made on the website. As I’ll explain in the following paragraphs, I was wrong and it wasn’t that easy, but in any case that chapter gave me a hint to explore something I was curious about: what was the impact of becoming an Amazon Prime customer on my Amazon purchase habits?

No matter if you’re working with big or little data, the major steps you need to follow for data analysis can be summarized as follows (if you’re a data analyst and don’t agree with these steps, you’re probably right!):

Data gathering: this step focuses on obtaining the data from one (in case you’re lucky) or more (the real world) sources.
Data normalization: in case you have more than one source of data or data structure, a normalization phase is required (or at least recommended!).
Data munging: “The difference between data found in many tutorials and data in the real world is that real-world data is rarely clean and homogeneous”. Great sentence by Jake VanderPlas in Python Data Science Handbook. He is right :-)
Data analysis and plot: execute the right algorithm and plot the data in a graph that will help you to understand the meaning of the data (extract information out of the data).

In order to analyse the impact of Amazon Prime, I went through these four steps, which I’ll describe in the following paragraphs.

Data gathering

I googled “Amazon purchases CSV report” and apparently https://www.amazon.es/gp/b2b/reports was the place I was looking for (mind the .es domain).

Amazon Spain CSV report

Well… a web form not using a single language is not what you’d expect from a trillion dollar valuation company.

I filled in the form to download the data and… oops!! It didn’t work. So I went to the amazon.com site (instead of amazon.es), as I assumed there was a temporary issue in the Spain web portal. In amazon.com things indeed got better: now I had a web form with a single language (English) and I was able to download my CSV report.

Amazon US CSV report

Wait!! The report included data only from 2011 to 2012. I was quite sure I had purchased more stuff on Amazon in the following years, so something was wrong there.

Suddenly I realized that Amazon had launched the amazon.es website some years ago, and since then I had not been able to buy on amazon.com anymore (at least if the same stuff was available in amazon.es). Apparently single sign-on works like a charm between amazon.com and amazon.es, but each data report is available only on the website where the purchase was made, and after 2012 all my purchases had happened in amazon.es. Building global products is hard!!

I was able to review on the Amazon Spain website the purchases I had made since 2012, but as mentioned at the beginning I was not able to download an automatically generated report.

I decided to generate the CSV report manually. I’m not a really big Amazon customer so eventually it didn’t take me more than one hour to generate it.

Now I was ready for the first interesting step.

Data normalization

Cool, so now I had two files, juan-amazon-us.csv and juan-amazon-es.csv, each of them holding a report with my purchases in a single Amazon online store.

Next step was data normalization. I was specifically interested in:

filter out fields from the CSV that I considered private data, so I could upload the CSV report to the wild Internet (in case someone is interested in playing with the Jupyter notebook).
currency normalization: US items were purchased in dollars, while ES items were in euros. I decided to convert euros to dollars.
merge both files: generate a single source for loading and analysing the data.

While python and a library like pandas might be a good fit for tackling the topics above, I preferred to implement this step directly as a script on top of bash. I have spent a considerable amount of time in the bash console during the last year, so I felt pretty confident about it. I did a couple of tests with the usual bash commands (cat, cut, awk, …) but eventually I decided to give jqlite a try.

jqlite is a wrapper on top of sqlite implemented by my colleague @drslump, which provides a SQL interface on top of tabular data (CSV, TSV, JSON).

It looked like a perfect match for the scenario I was interested in:

SELECT for fetching those fields I was ok to expose, and filtering out those I was not.
CASE…END for currency conversion.
UNION ALL for merging both files.

I came up with the following query:

echo "SELECT date
      , orderId
      , CASE WHEN currency = \"€\" THEN amount*1.1 ELSE amount END as amount
     FROM (
         SELECT \"Order Date\" as date,
                \"Order ID\" as orderId,
                substr(\"Item Total\", 2) as amount,
                substr(\"Item Total\", 1,1) as currency
         FROM (
            SELECT * FROM SPAIN UNION ALL SELECT * FROM USA
         )
     )" \
| jqlite "data/juan-amazon-es.csv@SPAIN" "data/juan-amazon-us.csv@USA"

At this point, my data was filtered, normalized and combined in a single file, ready to move to the python and pandas world.
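
For readers who prefer Python over SQL, the same three operations (keep only the public fields, convert euros to dollars, merge both sources) can be sketched with the standard csv module. This is only an illustration under assumptions: the sample rows, the "Shipping Address" column and the date format are made up, and the 1.1 rate is simply the one used in the query above.

```python
import csv
import io

EUR_TO_USD = 1.1  # same fixed rate as in the SQL query above


def normalize(rows):
    """Keep only the public fields and express every amount in dollars."""
    for row in rows:
        total = row["Item Total"]  # e.g. "$12.50" or "€10.00"
        currency, amount = total[0], float(total[1:])
        if currency == "€":
            amount *= EUR_TO_USD
        yield {
            "date": row["Order Date"],
            "orderId": row["Order ID"],
            "amount": round(amount, 2),
        }


# Made-up sample rows standing in for the two real reports
spain = io.StringIO(
    "Order Date,Order ID,Item Total,Shipping Address\n"
    "2017-07-01,ES-1,€10.00,private\n"
)
usa = io.StringIO(
    "Order Date,Order ID,Item Total,Shipping Address\n"
    "2011-03-15,US-1,$12.50,private\n"
)

# Equivalent of UNION ALL: rows from both sources, one after the other
merged = [
    row
    for source in (spain, usa)
    for row in normalize(csv.DictReader(source))
]
print(merged)
```

The private column never reaches the output because only the three selected keys are copied, which plays the role of the SELECT list.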

Data munging

With unnecessary fields dropped out and data normalisation done in the previous step, now it was time to model the data to be able to plot it with ease.

Pandas is a really convenient library for working with time series, and I really enjoyed learning about it with Python Data Science Handbook.

The steps I followed to model the data were:

Load the CSV into memory and create a pandas DataFrame with the columns (date,orderId,amount).
Reduce purchase time granularity from day to month.
Set the month column as the pandas DataFrame index.
Generate a pandas Series having as index a time series from the first month in which I made a purchase until now, and zeros as values.
Calculate the number of purchases and total amount per month.
Calculate the accumulated amount spent up to every month.
Fill NaN values with the relevant data.

You can find the actual code in the github repository that contains the Jupyter notebook.
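
The logic of those steps can also be sketched in plain Python. This is not the notebook code (which uses pandas) but a dependency-free illustration; the function names and the sample purchases are hypothetical, and the last month is passed explicitly instead of using "now" so that the result is reproducible.

```python
from collections import Counter, defaultdict


def month_range(first, last):
    """Yield 'YYYY-MM' strings from first to last, both inclusive."""
    year, month = map(int, first.split("-"))
    end = tuple(map(int, last.split("-")))
    while (year, month) <= end:
        yield f"{year:04d}-{month:02d}"
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def monthly_summary(purchases, last_month):
    """purchases: iterable of (iso_date, order_id, amount) tuples."""
    orders = Counter()
    spent = defaultdict(float)
    for date, _order_id, amount in purchases:
        month = date[:7]  # reduce granularity from day to month
        orders[month] += 1
        spent[month] += amount
    summary, running = [], 0.0
    for month in month_range(min(orders), last_month):
        running += spent[month]  # months without purchases contribute 0
        summary.append((month, orders[month], spent[month], running))
    return summary


# Made-up purchases, not the real data set
sample = [
    ("2017-05-03", "A", 20.0),
    ("2017-07-10", "B", 5.0),
    ("2017-07-21", "C", 7.5),
]
for row in monthly_summary(sample, "2017-08"):
    print(row)
```

Each printed tuple holds the month, the number of orders, the amount spent in that month and the accumulated amount, with empty months filled with zeros rather than left missing.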

Data analysis and plot

I became an Amazon Prime customer in June 2017. In the following graphs you can see:

the total amount spent from 2011 till 2018.
the number of orders, per month.

The vertical line shows the exact moment when I became an Amazon Prime user.

Amazon Purchases analysis

Conclusions

Even though the total amount spent grew considerably after I became an Amazon Prime customer, I think the most relevant conclusion is that after that date (June 2017) I’ve purchased at least one item on Amazon almost every month.

So even though Amazon Prime price in Spain is (was) cheap, it seems like a big win for Amazon to subsidize it.

Last, but not least, I’ll put some effort into reducing the number of purchases I make on Amazon in favor of local shops.

http://juandebravo.com/2018/09/09/impact-of-having-amazon-prime/

Smudge and clean your git working directory

Dec 2, 2017 Updated Dec 2, 2017

Some years ago I learned about smudging and cleaning data in a git repository, a very cool git functionality that I hadn’t used for a while until last week, when it came back to my mind for solving a specific need.

Smudge and clean are two functionalities that can be configured via git attributes. While git attributes are not a very popular git configuration, they provide several handy functionalities:

identify files as binary files: this can be convenient to skip those files in git show and git diff output (by default, git does not know how to compare binary files).
configure a tool/script for diffing binary files: relevant in case you don’t want to skip binary files while checking changes in your working directory but instead want to show meaningful diff information. For example, to compare two versions of a video file, you could use exiftool to obtain metadata from the two versions and compare it.
define merge strategies for specific files/paths.
configure your preferences for archiving your repository.
update files automatically upon checkout/commit: it’s possible to change automatically the file content right after running git checkout or right before running git commit. This is the functionality I will focus on.

Smudging your checkout

Let’s say you need a secret token in your code inside the config.h file, you want to commit that file to the git repository, but you don’t want it to contain sensitive information.

The simplest solution would be to create a placeholder (e.g. {your-secret-token}) that should be updated with the right value (e.g. super-secret-value!) upon checkout:

const std::string secretToken = "{your-secret-token}";

// After checkout, the line should be updated to:
const std::string secretToken = "super-secret-value!";

As soon as you update the file to include your secret value, the file will be shown as modified in the git status command.

Thanks to the smudge filter, you can:

automate the process, so the file will be updated automatically upon checkout.
prevent the file change from being part of the git diff output.

You need two steps for configuring a smudge filter:

Identify in the .gitattributes file the files that should be processed by the filter. In this simple example we want to process the config.h file using a filter named updateSecretToken:

# .gitattributes file
config.h filter=updateSecretToken

Define the smudge filter updateSecretToken, which will substitute the placeholder with your secret. This can be done in the git global configuration:

git config --global filter.updateSecretToken.smudge 'sed "s/{your-secret-token}/super-secret-value!/"'

Next time you check out the config.h file, its content will be automatically updated and your secret will be included in the file.

Cleaning your commit

Just as you need to update the config.h file content upon checkout to include your secret, it’s equally important to update the content before adding the file to the staging area, putting back {your-secret-token} instead of super-secret-value!. Otherwise, your secret would be exposed.

To do so, you can define the clean filter updateSecretToken. Similar to the smudge filter, this can be done in the git global config:

git config --global filter.updateSecretToken.clean 'sed "s/super-secret-value!/{your-secret-token}/"'
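
The two filters only work well together if cleaning undoes exactly what smudging did. The following Python sketch models both sed commands as string substitutions to make that round trip explicit. It is a model of the idea, not something git runs, and the function names are hypothetical.

```python
PLACEHOLDER = "{your-secret-token}"
SECRET = "super-secret-value!"


def smudge(text):
    """Model of the smudge filter (checkout): placeholder -> secret."""
    return text.replace(PLACEHOLDER, SECRET)


def clean(text):
    """Model of the clean filter (staging): secret -> placeholder."""
    return text.replace(SECRET, PLACEHOLDER)


# Note: sed without the g flag replaces only the first match per line,
# while str.replace replaces every match. With one token per line, as in
# config.h, both behave the same.
committed = 'const std::string secretToken = "{your-secret-token}";\n'
working_copy = smudge(committed)

print(working_copy, end="")
print(clean(working_copy) == committed)
```

Because clean(smudge(x)) gives back x, what git stages is identical to what is already committed, which is why the file does not show up as modified.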

Smudge and clean filters

http://juandebravo.com/2017/12/02/git-filter-smudge-and-clean/

WebRTC. Improve the setup time with Chrome 59

Apr 1, 2017 Updated Apr 1, 2017

WebRTC architecture relies on the ICE protocol for checking connectivity between peers and choosing the best interface for them to communicate. The basic idea behind ICE is that each peer in the communication has a variety of candidate transport addresses (combination of IP address and port for a particular transport protocol) it could use to communicate with the other peer. ICE helps to gather the candidate transport addresses, check connectivity and choose the best path for the communication. Gathering candidates takes time, and reducing that time as much as possible is critical to providing a good user experience.

Last week my colleague Gustavo reviewed some slides I was preparing for a lecture about WebRTC in Telefonica R&D.

In one slide, I mentioned that the setLocalDescription API indirectly controls the candidate gathering process. Gustavo pointed out that ICE candidate gathering can start before calling setLocalDescription, as defined in the JSEP protocol:

JSEP applications typically inform the JSEP implementation to begin ICE gathering via the information supplied to setLocalDescription, as this is where the app specifies the number of media streams, and thereby ICE components, for which to gather candidates. However, to accelerate cases where the application knows the number of ICE components to use ahead of time, it may ask the implementation to gather a pool of potential ICE candidates to help ensure rapid media setup.

While Gustavo was right and the specification did mention that possibility, in my tests I was never able to observe such behaviour, as candidate gathering happened only after calling setLocalDescription.

But there is good news for the near future :relaxed: : Chrome 59 implements the iceCandidatePoolSize member of RTCConfiguration, which instructs the RTCPeerConnection instance to gather ICE candidates before setLocalDescription is called, as a performance optimization. No news about when Firefox will implement this functionality.

In a simple demo that you can find here, the time required for getting the user media, creating the peer connection and completing the candidate gathering is reduced from 221ms to 180ms, a reduction of around 19%.

Reducing the call setup time as much as possible is a key factor in providing a good user experience, and being able to start gathering candidates before creating the offer will indeed help with that.

http://juandebravo.com/2017/04/01/improve-setup-time-webrtc/

WebRTC. Chrome 57 to set multiplexing policy to require by default

Feb 15, 2017 Updated Feb 15, 2017

Real-time Transport Protocol (RTP) includes two separate components:

the data transfer protocol itself (RTP).
a control protocol associated with the data (RTCP, where C stands for Control).

The RTP protocol specification states that the “underlying protocol MUST provide multiplexing of the data and control packets, for example using separate port numbers with UDP”.

Due to the complexity that using two different ports entails (mainly due to NAT traversal), RFC 5761 provides “an alternative to demultiplexing RTP and RTCP using separate UDP ports, instead using only a single UDP port and demultiplexing within the application.”

That’s cool, how is this related to WebRTC?

WebRTC media negotiation is based on the Session Description Protocol (SDP), following the offer/answer model. To indicate the desire to multiplex RTP and RTCP packets, the a=rtcp-mux attribute is used. This is a (partial) example of an SDP offer obtained in TU Go:

[...]
a=rtcp:9 IN IP4 0.0.0.0
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
a=sendrecv
a=rtcp-mux
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:106 CN/32000
a=rtpmap:105 CN/16000
a=rtpmap:13 CN/8000
a=rtpmap:110 telephone-event/48000
a=rtpmap:112 telephone-event/32000
a=rtpmap:113 telephone-event/16000
a=rtpmap:126 telephone-event/8000
a=candidate:3455144793 1 udp 1685987071 95.23.241.104 62622 typ srflx raddr 192.168.1.131 rport 62622 generation 0 ufrag 9Che network-id 1 network-cost 10
[...]

In the example above, the offerer supports RTCP multiplexing.

In the same way, the SDP answer should also include the a=rtcp-mux attribute to accept RTCP multiplexing:

[...]
c=IN IP4 91.220.9.62
t=0 0
m=audio 25384 UDP/TLS/RTP/SAVPF 111 110 106
a=rtpmap:111 opus/48000/2
a=fmtp:111 useinbandfec=1; minptime=10
a=rtpmap:110 telephone-event/8000
a=rtpmap:106 CN/8000
a=ptime:20
a=rtcp-mux
a=rtcp:25384 IN IP4 91.220.9.62
a=candidate:9711301045 1 udp 659136 91.220.9.62 25384 typ host generation 0

That’s cool, how is this related to WebRTC?

WebRTC NAT traversal is based on Interactive Connectivity Establishment methodology (ICE). While gathering candidates:

if RTP and RTCP are sent on separate ports, connectivity checks are required for both components.
if RTP and RTCP are multiplexed on the same port, only one connectivity check is required.

Negotiation for using RTP/RTCP multiplexing is done using the a=rtcp-mux, a=rtcp:... and a=candidate:... lines in both offer and answer.

Bottom line, multiplexing RTP and RTCP reduces ICE overhead, as it requires gathering fewer candidates, and it reduces the overhead in other parts of the VoIP architecture, e.g. fewer UDP ports used in the TURN servers and less bandwidth wasted in connectivity checks.
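
To make the negotiation more tangible, here is a small Python sketch that checks whether an SDP blob announces RTCP multiplexing in every media section. The helper names are hypothetical and the parsing is deliberately naive (plain line matching), so take it as an illustration rather than a real SDP parser. The sample is a shortened version of the answer shown above.

```python
def media_sections(sdp):
    """Split an SDP blob into one list of lines per m= section."""
    sections = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            sections.append([line])
        elif sections:
            sections[-1].append(line)
    return sections


def supports_rtcp_mux(sdp):
    """True when every media section carries the a=rtcp-mux attribute."""
    sections = media_sections(sdp)
    return bool(sections) and all("a=rtcp-mux" in s for s in sections)


answer = """c=IN IP4 91.220.9.62
t=0 0
m=audio 25384 UDP/TLS/RTP/SAVPF 111 110 106
a=rtpmap:111 opus/48000/2
a=rtcp-mux
a=rtcp:25384 IN IP4 91.220.9.62
"""

print(supports_rtcp_mux(answer))
```

Multiplexing is only used when both sides agree, so a real application would run a check like this against the offer and the answer.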

Cool, so what’s next?

PeerConnection constructor allows the application to specify global parameters for the media session. Among them, the application can specify its preferred policy regarding use of RTP/RTCP multiplexing using one of the following policies:

negotiate: includes the a=rtcp-mux attribute in the SDP, while gathering candidates for both RTP and RTCP.
require: includes the a=rtcp-mux attribute in the SDP, and gathers only RTP candidates. This halves the number of candidates that the offerer needs to gather.

As defined in the JavaScript Session Establishment Protocol (JSEP), “the default multiplexing policy MUST be set to require”.

In reality, the default multiplexing policy has always been ‘negotiate’ in both Chrome and Firefox. But that’s changing in the near future.

Starting in Chrome M57, the default rtcpMuxPolicy has changed from negotiate to require. It should not create an issue if you’re doing peer-to-peer communications, as browsers should be able to negotiate the RTP/RTCP multiplexing capability. But it can indeed create issues if you’re using a WebRTC gateway or expecting RTCP candidates in your code for any reason. Nimble Ape describes an issue with Asterisk interoperability, as Asterisk does not support RTCP multiplexing.

Should you need to keep the old behaviour, you can explicitly set the policy to negotiate while creating the PeerConnection object.

But please be aware that there’s an intent to deprecate the negotiate option. Chrome 58 will print a deprecation message if the rtcpMuxPolicy negotiate is used. JSEP says “Implementations MAY choose to reject attempts by the application to set the multiplexing policy to ‘negotiate’”, so be warned :smiley_cat:.

What about Firefox? Firefox is still using the negotiate rtcpMuxPolicy default value, and there’s no way to specify the require option while creating the PeerConnection object. I’ve opened a ticket in bugzilla for tracking it.

Happy media negotiation! :tiger:

http://juandebravo.com/2017/02/15/rtcp-mux-in-webrtc/

Docker and how to get access to insecure registries

Jan 22, 2017 Updated Jan 22, 2017

I just realized 2016 passed by and this blog didn’t get any update. I hope you were not worried about me! I’ve been fine! Just a bit busy :construction_worker:.

For the last two weeks I’ve been playing a bit with Docker. In general, before getting your feet wet with a new technology, it’s convenient (required?) to either go through the project’s documentation, which is usually great, or follow a tutorial about it. I did the latter, and as a result I pushed my first two Docker images to Docker Hub. Nothing really impressive, but it helps you to go through the basics.

Last week I started a side project and, for it to be more interesting, I decided to go with Docker and Kubernetes :neckbeard:. It turns out that the project cannot be public, so I needed to use an internal Docker hub we’re using in my current project for keeping it private.

Unfortunately, that private Docker hub is configured to accept only HTTP requests, instead of HTTPS. I know… bear with me please :pray:.

After tagging the image, I tried to push it to that Docker hub and got this response:

fish> docker push <insecure-docker-hub-hostname>/<image-name>:<image-tag>
The push refers to a repository [<insecure-docker-hub-hostname>/<image-name>]
Get https://<insecure-docker-hub-hostname>/v1/_ping: dial tcp <IP-address>:443: getsockopt: connection refused

The console output shows that Docker is trying to connect to the Docker hub via HTTPS.

To overcome this and get access via HTTP, you need to do the following:

If you’re using Mac OSX Docker client:

Go to Docker -> Daemon -> Basic -> Insecure registries
Add <insecure-docker-hub-hostname> to the list
Restart Docker

If you’re using a Linux distribution:

Open file /etc/sysconfig/docker
Add INSECURE_REGISTRY="--insecure-registry=<insecure-docker-hub-hostname> "
Restart Docker

Now you’re ready to work with your insecure Docker hub!

fish> docker push <insecure-docker-hub-hostname>/<image-name>:<image-tag>
The push refers to a repository [<insecure-docker-hub-hostname>/<image-name>]
...

http://juandebravo.com/2017/01/22/docker-insecure-registry/

A virtual machine for WebRTC development

Nov 23, 2015 Updated Nov 23, 2015

As described in the WebRTC project main webpage, WebRTC is a free, open project that provides browsers and mobile applications with Real-Time Communications (RTC) capabilities via simple APIs. Or in case you prefer the Wikipedia definition, WebRTC (Web Real-Time Communication) is an API definition drafted by the World Wide Web Consortium (W3C) that supports browser-to-browser applications for voice calling, video chat, and P2P file sharing without the need of either internal or external plugins.

WebRTC implements three basic new APIs:

getUserMedia: it gives access to synchronized streams of media (e.g. from the camera and microphone).
RTCPeerConnection: it handles stable and efficient communication of streaming data between peers.
DataChannel: it enables peer-to-peer exchange of arbitrary data.

iswebrtcreadyyet lists the different WebRTC functionalities that are supported by the latest version of the most widely used browsers.

WebRTC APIs are still in early stages, which means that new browser versions could introduce non-backwards-compatible changes, and API names may be prefixed by a browser prefix (moz, webkit, …), which eventually means the JS developer is in charge of building a shim to support multiple browsers in her WebRTC application.

WebRTC team has created a shim called adapter as an effort for hiding those API changes and prefix differences from developers.

In case you want to get started with adapter development, you will need a Debian box to ensure you can run the multi-browser tests.

In order to make this process simpler, I have put some effort into creating a Vagrant box called webrtc-box, which is in charge of automating the development environment setup for working on WebRTC adapter.

Running adapter tests should be as simple as:

# Clone repository
host $ git clone https://github.com/juandebravo/webrtc-box.git
host $ cd webrtc-box
host $ vagrant up

# Connect to guest machine
host $ vagrant ssh

# Execute tests
vagrant@debian $ cd /adapter
vagrant@debian /adapter $ npm test

And that’s it, you’re ready to start hacking on top of adapter.

Happy webrtc-ing!!! :city_sunrise: :squirrel:

In case you read this post before November 30th 2015, you’re in time to make a donation to our Movember team!!! :man:

http://juandebravo.com/2015/11/23/webrtc-adapter-vagrant-box/

Demystifying coroutines in python

Jul 6, 2015 Updated Jul 6, 2015

Coroutines are special functions that differ from usual ones in four aspects:

they expose several entry points. An entry point is the line of code inside the function where it takes control over the execution.
they can receive a different input at every entry point while the coroutine is executing.
they can return different outputs in response to the different entry points.
they can save control state between entry point calls.

Python implements coroutines starting in Python 2.5 by reusing the generator syntax, as defined in PEP 342 - Coroutines via Enhanced Generators.

Generator syntax is defined in PEP 255 - Simple Generators. I briefly covered the generator functionality in a previous post. The basic usage of generators is creating an iterator over a data source. For example, the function in the snippet below returns a generator that iterates from a specific number down to 0, decreasing by one unit in every iteration.

def countdown(n):
    print "Counting down from %s to 0" % n
    i = n
    while i >= 0:
        yield i
        i = i - 1

>>> for i in countdown(5):
...     print i
...
Counting down from 5 to 0
5
4
3
2
1
0

In the example above, the keyword yield is used to return a new value in every iteration while consuming the generator. It’s interesting to note that a generator can be consumed only once, as opposed to a list, which can be iterated as many times as needed. A generator is considered exhausted once it has been consumed for the first time.

PEP 342 takes advantage of the keyword yield for pointing out entry points where the function/coroutine will receive inputs while being executed. Let’s see a very simple example of a coroutine that concatenates every string entered by the user from the command line:
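
The "consumed only once" behaviour is easy to check. The sketch below is written for Python 3 (unlike the Python 2 snippets in this post) and uses a shortened countdown without the print statement.

```python
def countdown(n):
    """Yield n, n-1, ..., 0."""
    while n >= 0:
        yield n
        n -= 1


gen = countdown(3)
first_pass = list(gen)   # consumes the generator
second_pass = list(gen)  # nothing left: the generator is exhausted

numbers = [3, 2, 1, 0]
list_passes = (list(numbers), list(numbers))  # a list can be iterated again

print(first_pass, second_pass)
```

To iterate again you need a fresh generator, i.e. another call to countdown(3).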

def concatenate(_str):
    """
    Coroutine that receives a new string in every
    iteration and concatenates to the original one
    """
    temp = None

    while temp != '':
        # Wait for a new input (suspend the coroutine)...
        temp = yield
        # ... and save control state (resume the execution)
        _str += temp
        print _str

# Instantiate a new coroutine...
a = concatenate('foo')

# ... and "move" the coroutine state till the `yield` keyword
a.next()

while True:
    try:
        # Send the raw input from the user to the coroutine...
        a.send(raw_input())
    except StopIteration:
        # ... and capture the coroutine end by means
        # of StopIteration exception
        break

What is really interesting is how the coroutine execution is suspended and resumed by means of the yield keyword, allowing the program flow to be moved from the coroutine to the external program and back to it.

python coroutine.py
bar
    foobar
bazz
    foobarbazz

    foobarbazz
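
For reference, a Python 3 flavoured sketch of the same idea follows. It differs from the snippet above in two ways that are my own choices for illustration: the coroutine yields the concatenated string back to the caller instead of printing it, and a hypothetical feed helper replaces the keyboard input so the example can run unattended.

```python
def concatenate(text):
    """Concatenate every string sent in; finish on an empty string."""
    received = None
    while received != "":
        # Suspend here: hand the current value out and wait for input
        received = yield text
        # Resumed by send(): the local state (text) is still there
        text += received


def feed(coroutine, inputs):
    """Drive the coroutine with a list of inputs and collect its outputs."""
    outputs = []
    next(coroutine)  # advance to the first yield
    for item in inputs:
        try:
            outputs.append(coroutine.send(item))
        except StopIteration:
            # The coroutine finished (it received an empty string)
            break
    return outputs


results = feed(concatenate("foo"), ["bar", "bazz", "", "ignored"])
print(results)
```

In Python 3 the built-in next(a) replaces a.next(), and input() replaces raw_input().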

Should you be interested in additional information about this topic, I recommend going through David Beazley’s slides regarding generators and coroutines.

As a side project I have implemented a tiny library called washington that exposes a chainable API for building a coroutines stream. I had a lot of fun while digging into the implementation, even though the real usage of the library is expected to be very limited :blush:

Happy coroutining!!! :bicyclist: :fireworks:

http://juandebravo.com/2015/07/06/demystifying-coroutines-in-python/

Don't use YYYY in your date format template

Apr 10, 2015 Updated Apr 10, 2015

yyyy is the pattern string to identify the year in the SimpleDateFormat class.

Java 7 introduced YYYY as a new date pattern to identify the week year of a date.

An average year is exactly 52.1775 weeks long, which means that a year has either 52 or 53 weeks when only whole weeks are counted.

Using YYYY unintentionally while formatting a date can cause severe issues in your Java application.

As an example:

    import java.util.Date;
    import java.text.SimpleDateFormat;

    public class DataExample {
        public static void main(String args[]) {
            try {
                String date_s = "2014-12-31";
                SimpleDateFormat dt = new SimpleDateFormat("yyyy-MM-dd");
                Date d = dt.parse(date_s);
                SimpleDateFormat dt1 = new SimpleDateFormat("YYYY");
                System.out.println("And the year is... " + dt1.format(d));
            } catch (Exception e) {
                System.out.println(e);
            }
        }
    }

The snippet above prints “And the year is… 2015”, because week year 2015 started on 29/12/2014.
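
The same distinction between calendar year and week year exists outside Java. As a quick cross-check, here is a Python sketch using the ISO 8601 week year from date.isocalendar(). Bear in mind it is only an analogue: Java derives the week year from the locale's week rules, while Python always applies the ISO rules.

```python
from datetime import date

d = date(2014, 12, 31)

calendar_year = d.year              # analogous to yyyy
iso_week_year = d.isocalendar()[0]  # analogous to YYYY (ISO 8601 rules)

print(calendar_year, iso_week_year)
```

The two values differ only for a few days around New Year, which is exactly why this kind of bug can go unnoticed for most of the year.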

This issue seemed to be the root cause of the massive outage that Twitter suffered last year.

So double check if you really need to use YYYY while formatting your date, and in case of doubt… Better call Saul!!

http://juandebravo.com/2015/04/10/java-yyyy-date-format/

Hanoi. Toggle functionalities in python

Feb 15, 2015 Updated Feb 15, 2015

Upon deploying a new version of your product into production, it’s usually handy to enable the new functionalities only for a subset of users. This allows you to measure the impact of your code changes in a controlled way:

Does the new functionality have an unexpected impact on server performance that was not detected in the stress testing lab?
Is the new functionality increasing session duration, global product usage, customer satisfaction, etc?
What is the impact on the client side? Is it creating a battery drain or any other unexpected issue?

Feature toggling can be handy as well to reduce the impact of dependencies among components. If a new functionality in component A requires a new version of component B, the new functionality in component A can be toggled off until the new B version reaches production, so the deployment pipeline stays loosely coupled.

Over my last Christmas holidays I spent some time visiting Vietnam, and back then I read about the rollout gem, a ruby library that implements feature toggling using Redis as backend. The proclaim python port is kind of outdated, so I decided to build another python port, which has been named hanoi.

Hanoi can be used for the following scenarios:

Enable/disable a functionality globally (toggle on/off)
Enable a functionality for a percentage of users, increasing the percentage gradually to verify server and client behaviour.
Enable a functionality for specific users (whitelist users)
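
The three scenarios above can be illustrated with a few lines of Python. To be clear, this is NOT hanoi's API: the FeatureToggle class, its attributes and the CRC32 bucketing are assumptions made for this sketch, and it keeps everything in memory.

```python
import zlib


class FeatureToggle:
    """Hypothetical in-memory toggle, for illustration only."""

    def __init__(self):
        self.enabled_globally = False
        self.percentage = 0  # 0..100
        self.whitelist = set()

    def is_enabled(self, user_id):
        if self.enabled_globally or user_id in self.whitelist:
            return True
        # Stable bucket in [0, 100): the same user always lands in the
        # same bucket, so raising the percentage only ever adds users
        bucket = zlib.crc32(user_id.encode("utf-8")) % 100
        return bucket < self.percentage


toggle = FeatureToggle()
toggle.whitelist.add("alice")
print(toggle.is_enabled("alice"), toggle.is_enabled("bob"))

toggle.percentage = 100
print(toggle.is_enabled("bob"))
```

Hashing the user id, instead of picking users at random on every request, is what keeps the experience consistent for each user during a gradual rollout.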

Currently three backends are implemented (a memory-based backend and a Redis backend in two different flavors). My expectation is to include additional backends in the future to support additional storages, such as memcached and MongoDB.

For additional information about hanoi check the documentation in the github repository.

Happy deploying! :hammer:

Hanoi

http://juandebravo.com/2015/02/15/toggle-func-in-python/

Validating Arity in JavaScript

Jan 23, 2015 Updated Jan 23, 2015

Last week I was discussing with my pals @drslump and @ladybenko a very simple idea I came up with while reading John Resig’s book Secrets of the JavaScript Ninja over the Christmas holidays.

The idea is very simple: ensure that a function is called with the expected number of parameters.

A function defined like:

var fullName = function fullName (name, surname) {
    return name + ' ' + surname;
};

could be called with:

zero parameters: name and surname will be undefined.
one parameter: the parameter will be assigned to name, surname will be undefined.
two parameters: the former parameter will be assigned to name, the latter to surname.
three or more parameters: the first two parameters will be assigned to name and surname respectively, the next ones could be accessed via arguments.

You might want to ensure the function is always called with two parameters, so scenarios like this one won’t happen:

fullName("Foo");
'Foo undefined'

Let’s do the magic by defining a method in the Function prototype:

Function.prototype.validateArity = function validateArity () {
    var fn = this;
    return function () {
        if (arguments.length === fn.length) {
            return fn.apply(this, arguments);
        } else {
            throw new Error("Arity was <"+arguments.length+"> but expected <"+fn.length+">");
        }
    };
};

Simply adding to our previous function the following:

var fullName = function fullName (name, surname) {
    return name + ' ' + surname;
}.validateArity();

// Correct call
console.log(fullName("Foo", "Bar"));
Foo Bar

// Incorrect call
console.log(fullName("Foo"));
Error: Arity was <1> but expected <2>

Protip: adding a function to Function prototype is usually NOT a good idea.

http://juandebravo.com/2015/01/23/validate-arity/

Using travis-ci to build this blog

Nov 22, 2014 Updated Nov 22, 2014

This blog is built using Jekyll as static site generator and GitHub as the deployment target, thanks to its support for Jekyll via GitHub Pages.

One of the great things about using GitHub together with Jekyll is the out-of-the-box support that GitHub provides for generating your HTML pages with Jekyll on the server side whenever new content is added to your repository.

Some months ago I ran some experiments to add emoji support to this blog :smile:. Jekyll supports a plugin mechanism to extend the framework functionality.

The bad news is that GitHub does not allow third-party plugins, like this one, to build your Jekyll based web page. The reason behind this might be to avoid any kind of attack from a malicious user. So I needed to either build the site locally and upload the HTML version to GitHub, or get rid of emojis.

I have been building the site locally over the last months, but some weeks ago I started playing with Travis to take this boring work off my hands. These are the relevant steps I followed:

Create a new account in Travis
Configure your GitHub project to support travis. Here you can find the list of supported services & webhooks (JSON format).
Install the travis gem in your local environment

gem install travis

Generate a valid token in the GitHub project settings page.
Encrypt the token so that Travis can use it securely (as the value will be included in your repository)

travis encrypt -r <USER>/<REPOSITORY> GH_TOKEN=<GH-TOKEN> --add env.global

Configure your build in the .travis.yml file at the root of your repository.
In order to use a GitHub OAuth token, the configured URL should be the HTTPS one:

git remote set-url origin https://${GH_TOKEN}@github.com/juandebravo/juandebravo.github.com.git

Run the command travis-lint to validate the file syntax.

Travis is a great piece of software that you can take advantage of very easily!! :bowtie: :neckbeard:

http://juandebravo.com/2014/11/22/using-travis-ci-to-build-this-blog/

Registering subclasses in python and ruby

Mar 12, 2014 Updated Mar 12, 2014

Some years ago while I was contributing to the awesome Open Source project Adhearsion, we decided to split the logic contained in 1.0 version into different modules that might be loaded by the developer when needed. Adhearsion 1.0 was tightly coupled with gems like activerecord and activesupport, which were not required for the framework basic functionality and did not provide any real value to most of the Adhearsion applications. Decoupling the logic allowed developers to include only the required dependencies in their applications.

As a result of this exercise, different gems were developed to provide isolated functionalities, such as adhearsion-activerecord, which builds the bridge to use ActiveRecord in an Adhearsion application.

Besides the primary goal of decoupling logic, developers should also be able to create their own modules to extend the base functionality provided by Adhearsion. Those modules were called plugins.

While thinking about how to build this plugin functionality, I dug into different libraries trying to find a clean solution, and Rails came to the rescue. The developer that is using Rails can define her own plugins to extend the functionality of the framework or modify its initialization. These plugins are called Railties. Creating a Railtie is really simple:

inherit from Railtie class.
load your class during the Rails boot process.

But… how does Rails know that it should execute the code defined in the Railtie subclass while booting? The trick is based on a cool feature of the Ruby language, which defines a hook in the parent class that is invoked every time the class is inherited. Here you can see the snippet of code that builds the Railtie magic.

Eventually, for Adhearsion plugins I followed the same rule.

Find below two snippets of code that register a list of subclasses, one in Ruby and one in Python.

class Plugin

  class << self

    def inherited(base)
      registry << base
    end

    def registry
      @registry ||= []
    end

    def each(&block)
      registry.each do |member|
        block.call(member)
      end
    end

  end
end

class Foo < Plugin
end

Bar = Class.new Plugin

puts "Plugin subclasses: " + Plugin.each(&:to_s).join(', ')

class Registry(type):

    def __init__(cls, name, bases, dct):
        if not hasattr(cls, 'registry'):
            # Parent class
            cls.registry = []
        else:
            # Child class
            cls.registry.append(cls)
        super(Registry, cls).__init__(name, bases, dct)


class Plugin(object):
    __metaclass__ = Registry


class Foo(Plugin):
    pass

Bar = type('Bar', (Plugin,), {})

print "Plugin subclasses: " + ", ".join([item.__name__ for item in Plugin.registry])

The output of both scripts is:

Plugin subclasses: Foo, Bar
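
The Python snippet above targets Python 2 (the __metaclass__ attribute and the print statement). On Python 3.6 and later, a similar registry can be built without a metaclass. The following is a minimal sketch, not part of the original scripts, that relies on the __init_subclass__ hook, which Python calls on the parent class every time it is subclassed, much like Ruby's inherited:

```python
class Plugin:
    # Shared list that collects every subclass as it is defined
    registry = []

    def __init_subclass__(cls, **kwargs):
        # Called by Python each time Plugin (or a descendant) is subclassed
        super().__init_subclass__(**kwargs)
        Plugin.registry.append(cls)


class Foo(Plugin):
    pass


# Classes created dynamically with type() trigger the hook as well
Bar = type('Bar', (Plugin,), {})

print("Plugin subclasses: " + ", ".join(cls.__name__ for cls in Plugin.registry))
```

The parent class itself never enters the list, so there is no need for the hasattr check used in the metaclass version.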

Happy coding! :kissing_cat:

http://juandebravo.com/2014/03/12/registering-subclasses-in-python-and-ruby/

Class and static methods in python

Mar 8, 2014 Updated Mar 8, 2014

Last week we had a nice conversation during a Python code review about when class and static methods should be used, or whether they should be used at all.

Find below my opinions on this topic. Feel free to comment and start some discussion :satisfied:.

In general, you should think about moving a class or static method to the module that holds the class definition. As you may well know, functions are first class citizens in python:

class Foo(object):

    VALUES = ('1', '2', '3')

    def __init__(self, bar):
        self.bar = bar

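    @staticmethod
    def get_values():
        # A static method has no implicit access to class attributes,
        # so VALUES must be reached through the class name
        return Foo.VALUES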

This snippet can be refactored to:

def get_foo_values():
    return Foo.VALUES

class Foo(object):

    VALUES = ('1', '2', '3')

    def __init__(self, bar):
        self.bar = bar

Some exceptions that might apply:

The class provides several static methods to validate attributes, retrieve class information, etc. Moving all these methods might fill up the module with too many definitions that are clearly tied to a specific class.
The module holds several classes.
You want to define an auxiliary method to create an instance, and you want subclasses to be able to override the initialization (consider whether you should use composition instead of inheritance!):

PROPERTIES = {'name': 'john',
              'surname': 'doe',
              'age': 28,
              'email': 'john@pollinimini.net'}


class Foo(object):

    @classmethod
    def from_properties(cls, properties):
        ins = cls()
        for k, v in properties.iteritems():
            # setattr uses the value of k as the attribute name
            setattr(ins, k, v)
        return ins

    def __str__(self):
        return ', '.join(self.__dict__.keys())


class Bar(Foo):

    @classmethod
    def from_properties(cls, properties):
        ins = super(Bar, cls).from_properties(properties)
        ins.deferred = True
        return ins

print Foo.from_properties(PROPERTIES)
print Bar.from_properties(PROPERTIES)

The output of the program is:

age, surname, name, email
deferred, age, surname, name, email

Bar adds an extra attribute to the instance, but the preliminary instance initialization is the same as in Foo.
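
What makes a class method (rather than a static method) the right tool here is that cls is the class the method was called on, so a subclass gets an instance of its own type. A minimal sketch of that point in Python 3 syntax, using a shortened, illustrative PROPERTIES dictionary:

```python
PROPERTIES = {'name': 'john', 'surname': 'doe', 'age': 28}


class Foo:

    @classmethod
    def from_properties(cls, properties):
        # cls is Foo when called as Foo.from_properties(...)
        # and Bar when called as Bar.from_properties(...)
        ins = cls()
        for key, value in properties.items():
            setattr(ins, key, value)
        return ins


class Bar(Foo):

    @classmethod
    def from_properties(cls, properties):
        # Reuse the parent initialization, then extend it
        ins = super().from_properties(properties)
        ins.deferred = True
        return ins


foo = Foo.from_properties(PROPERTIES)
bar = Bar.from_properties(PROPERTIES)
print(type(foo).__name__, sorted(vars(foo)))
print(type(bar).__name__, sorted(vars(bar)))
```

With a static method, the parent implementation would have to hard-code Foo() and Bar could not reuse it to build Bar instances.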

There are other alternatives to the class method in this case though:

define a factory class that returns the proper class.
define a function that receives as a parameter a function to be applied while building the instance:

PROPERTIES = {'name': 'john',
              'surname': 'doe',
              'age': 28,
              'email': 'john@pollinimini.net'}


def create_foo_with_function(properties, func=None):
    ins = Foo.from_properties(properties)
    if func:
        func(ins)
    return ins


class Foo(object):

    @classmethod
    def from_properties(cls, properties):
        ins = cls()
        for (k, v) in properties.iteritems():
            setattr(ins, k, v)
        return ins

    def __str__(self):
        return ', '.join(self.__dict__.keys())


# Create a Foo instance
print create_foo_with_function(PROPERTIES)

# Create a Foo instance with an extra "deferred" attribute
print create_foo_with_function(PROPERTIES,
                               lambda ins: setattr(ins, 'deferred', True))
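
The first alternative, a factory, is not shown above. Here is a minimal sketch of what it could look like, in Python 3 syntax. The names FooFactory and create, and the kind argument, are hypothetical and only there for illustration:

```python
class Foo:
    def __init__(self, **properties):
        # Store every property as an instance attribute
        for key, value in properties.items():
            setattr(self, key, value)


class Bar(Foo):
    def __init__(self, **properties):
        super().__init__(**properties)
        self.deferred = True


class FooFactory:
    """Hypothetical factory that picks the class to instantiate."""

    def __init__(self):
        self._classes = {'foo': Foo, 'bar': Bar}

    def create(self, kind, properties):
        # Look up the requested class and build the instance
        return self._classes[kind](**properties)


factory = FooFactory()
bar = factory.create('bar', {'name': 'john', 'age': 28})
print(type(bar).__name__, bar.name, bar.deferred)
```

The decision about which class to build lives in one place, at the cost of an extra class to maintain.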

Happy coding! :kissing_cat:

http://juandebravo.com/2014/03/08/python-class-and-static-methods/

Building an automatic initializer in python

Jan 4, 2014 Updated Jan 4, 2014

One of the cool things about Scala is that, in general, you don’t need to write a lot of boilerplate while doing things in the “normal way”.

As an example, when defining a class whose constructor requires one or more arguments, there’s no need to assign the parameters to instance attributes. The compiler does this out of the box while generating the JVM bytecode:

class Foo(val x: String, val y: String)

val value = new Foo("bar", "bazz")
println(value.x)
//> bar

Python, like many other languages, does not behave that way, so you must assign the parameters to instance attributes yourself. Some would argue that this is indeed the pythonic way to work, making things as explicit as possible. Leaving aside that the method __init__ is not a constructor, you can do the assignment with the following snippet of code:

class Foo(object):

    def __init__(self, x, y):
        self.x = x
        self.y = y

value = Foo("bar", "bazz")
print value.x
# bar

It turns out that, a high percentage of the time, the only thing you need to do in your __init__ methods is assign the parameters to instance attributes. Being such a common behavior, it seems to me like a nice chance to build a decorator :smile_cat:. Here it goes:

import functools
import inspect

def autoinit(fn):
    spec = inspect.getargspec(fn)
    # Parameter names, skipping "self"
    names = spec.args[1:]
    # Map every parameter that has a default value to that default
    defaults = spec.defaults or ()
    kwa_defaults = dict(zip(names[len(names) - len(defaults):], defaults))

    @functools.wraps(fn)
    def _wrap(*args, **kwargs):
        self, nargs = args[0], args[1:]
        # Start from the defaults, then apply positional and keyword arguments
        values = dict(kwa_defaults)
        values.update(zip(names, nargs))
        values.update(kwargs)

        # Set the values to the instance attributes
        for k, v in values.iteritems():
            setattr(self, k, v)

        return fn(*args, **kwargs)

    return _wrap

The function that builds the decorator (autoinit) does two simple things:

retrieve the parameter names and the default values of the keyword parameters.
build and return a wrapper function that, every time a new instance is created, inspects both args and kwargs, resolves the actual value of every parameter, and assigns it to an instance attribute before calling the original __init__.

A usage example:

class Foo(object):

    @autoinit
    def __init__(self, x, y, a=10, b=100):
        pass

f = Foo(1, 2, 4)

print f.x, f.y, f.a, f.b
# 1 2 4 100
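
The decorator above is written for Python 2: func_code is called __code__ in Python 3, and inspect.getargspec was removed in Python 3.11. As a sketch of how the same idea could look on Python 3, inspect.signature can do the matching between call arguments and parameter names. This is one possible port, not the original implementation:

```python
import functools
import inspect


def autoinit(fn):
    signature = inspect.signature(fn)
    # Parameter names, skipping the first one ("self")
    names = list(signature.parameters)[1:]

    @functools.wraps(fn)
    def _wrap(self, *args, **kwargs):
        # Match positional and keyword arguments to parameter names
        bound = signature.bind(self, *args, **kwargs)
        # Fill in the parameters that were left at their default values
        bound.apply_defaults()
        for name in names:
            setattr(self, name, bound.arguments[name])
        return fn(self, *args, **kwargs)

    return _wrap


class Foo:

    @autoinit
    def __init__(self, x, y, a=10, b=100):
        pass


f = Foo(1, 2, 4)
print(f.x, f.y, f.a, f.b)
```

A side benefit of signature.bind is that a call with missing or unexpected arguments raises TypeError before any attribute is set.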

http://juandebravo.com/2014/01/04/python-auto-initializer/

Encoding exercise in python

Sep 11, 2013 Updated Sep 11, 2013

Unicode and encodings are always fun. This script encodes an input string using different encodings and shows the encoded bytes and their length:

# -*- coding: utf-8 -*-
import sys

if len(sys.argv) > 1:
    code_points = [unicode(c, 'utf-8') for c in sys.argv[1:]]
else:
    # Testing values
    code_points = [u'\U0001F37A\U00000045\U0000039B', u'\U0001F37A']


def handle_encoding(encoding, code_point):
    try:
        encoded = code_point.encode(encoding)
        values = ['{:>15}'.format(encoding),
                  ' ---> ',
                  ':'.join('{0:x}'.format(ord(c)) for c in encoded),
                  ' (', str(len(encoded)), ')']
        print ''.join(values)
    except Exception:
        values = ['{:>15}'.format(encoding),
                  ' ---> ',
                  'Unable to encode the codepoint in {0}'.format(encoding)]
        print ''.join(values)


for code_point in code_points:
    print '{:>15}'.format('character') + ' ---> ' + code_point
    print '{:>15}'.format('code points') + ' ---> ' + repr(code_point)
    for coding in ('ascii', 'latin-1', 'utf-8', 'utf-16', 'utf-16be', 'utf-16le'):
        handle_encoding(coding, code_point)

Example:

python encoding.py "OLA KE ASE"
      character ---> OLA KE ASE
    code points ---> u'OLA KE ASE'
          ascii ---> 4f:4c:41:20:4b:45:20:41:53:45 (10)
        latin-1 ---> 4f:4c:41:20:4b:45:20:41:53:45 (10)
          utf-8 ---> 4f:4c:41:20:4b:45:20:41:53:45 (10)
         utf-16 ---> ff:fe:4f:0:4c:0:41:0:20:0:4b:0:45:0:20:0:41:0:53:0:45:0 (22)
       utf-16be ---> 0:4f:0:4c:0:41:0:20:0:4b:0:45:0:20:0:41:0:53:0:45 (20)
       utf-16le ---> 4f:0:4c:0:41:0:20:0:4b:0:45:0:20:0:41:0:53:0:45:0 (20)
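
The script above is Python 2 (unicode() and the print statement). As a rough Python 3 sketch of the same exercise, the helper below returns only the encoded length per encoding. The function name encoded_lengths is made up for this example:

```python
ENCODINGS = ('ascii', 'latin-1', 'utf-8', 'utf-16', 'utf-16be', 'utf-16le')


def encoded_lengths(text, encodings=ENCODINGS):
    """Return {encoding: length in bytes}, or None when text can't be encoded."""
    lengths = {}
    for encoding in encodings:
        try:
            lengths[encoding] = len(text.encode(encoding))
        except UnicodeEncodeError:
            # The encoding has no representation for some character
            lengths[encoding] = None
    return lengths


for text in ('OLA KE ASE', '\U0001F37A'):
    print(repr(text), encoded_lengths(text))
```

The two extra bytes of utf-16 compared with utf-16be and utf-16le are the byte order mark, which is the ff:fe prefix in the example output above.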

Happy encoding :monkey:

http://juandebravo.com/2013/09/11/encoding-exercise-in-python/

JavaScript, perform multiple operations in parallel

Mar 31, 2013 Updated Mar 31, 2013

I’ve spent bits of my spare time over the last weeks improving my JavaScript skills. I’ve read Effective JavaScript: 68 Specific Ways to Harness the Power of JavaScript, by David Herman; I highly recommend this book to anyone who wants to dig deeper into this language.

Last Thursday my pal @rafeca raised an interesting question: how could we start two or more asynchronous operations in JavaScript and execute a callback once all of them have finished, but not before?

One of the last chapters of the book comes up with the answer: keep a counter of pending operations, store each response in an ordered array, and execute the callback when every response has been received (there are no pending operations).

Let’s see a specific example: retrieve a set of user profiles from a third party database and print the result in an HTML table only when all of them have been received.

In the following five steps we’ll figure out how to resolve this:

1.- Create the HTML skeleton to print the user profiles

<body>
    <table>
        <thead>
            <tr>
              <th class="name">Name</th>
              <th class="profile">Profile</th>
            </tr>
        </thead>
        <tbody id="profiles"></tbody>
    </table>
    <div id="error">
    </div>
</body>

2.- Create a simple function to retrieve user profiles

For the sake of simplicity, I’ve mocked the profile database using an in-memory dictionary:

var Profile = function(name, profile) {
    this.name = name;
    this.profile = profile;
};

var ProfileDB = (function() {
    var users = {
        john: new Profile("john", "manager"),
        mark: new Profile("mark", "developer"),
        thomas: new Profile("thomas", "QA")
    };

    var _getUserProfile = function(user, callback, error) {
            var profile = users[user];
            if(typeof profile === "undefined") {
                error("profile " + user + " not found");
            }
            else {
                if(callback) {
                    // simulate a random delay between 0 and 1 seconds
                    setTimeout(callback.bind(null, profile), 1000 * Math.random());
                }
                else {
                    return users[user];
                }
            }
        };

    return {
        getUserProfile: _getUserProfile
    };

})();

3.- Create a function to retrieve a set of user profiles and execute a callback once every profile has been retrieved

In this step we’re building the function that starts the required operations in parallel and executes the relevant callback (success, or error in case of any failure):

function getUserProfiles(users, onsuccess, onerror) {
    // number of pending operations
    var pending = users.length;
    // store results in this array; set to null after the first error
    var result = [];
    if (pending === 0) {
        // execute callback asynchronously if users is empty
        setTimeout(onsuccess.bind(null, result), 0);
        return;
    }
    users.forEach(function(user, i) {
        ProfileDB.getUserProfile(user, function(profile) {
            if (result) {
                // keep the order of the users array, not the arrival order
                result[i] = profile;
                pending--;
                if (pending === 0) {
                    // every profile has been retrieved, execute callback
                    onsuccess(result);
                }
            }
        }, function(error) {
            if (result) {
                // report only the first error and ignore later responses
                result = null;
                onerror(error);
            }
        });
    });
}

4.- Create the client to retrieve a list of user profiles

Step 4 creates the client that will use the function created in step 3. We need to provide callbacks for both success and error scenarios. Upon success, the user profiles are printed in the HTML skeleton built in step 1. In case of error, the specific message is shown in the div element:

var userElement = function(name, profile) {
    var el = document.createElement("tr");
    var _name = document.createElement("td");
    _name.appendChild(document.createTextNode(name));
    var _profile = document.createElement("td");
    _profile.appendChild(document.createTextNode(profile));
    el.appendChild(_name);
    el.appendChild(_profile);
    return el;
};

(function() {
    getUserProfiles(["john", "mark", "thomas"], function(profiles) {
        var fragment = document.createDocumentFragment();
        profiles.forEach(function(profile) {
            fragment.appendChild(userElement(profile.name, profile.profile));
        });
        document.getElementById("profiles").appendChild(fragment.cloneNode(true));
    }, function(error) {
        var el = document.getElementById("error");
        el.innerHTML= "";
        el.appendChild(document.createTextNode(error));
    });
})();

5.- Run it

In this case we’re updating the DOM just once, after retrieving the three user profiles. This isn’t a big advantage here, but if we were retrieving hundreds of elements, updating the DOM on every response could significantly reduce our application’s performance :squirrel:.
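
The pattern is not specific to JavaScript. Purely as a point of comparison, here is a sketch of the same exercise in Python, where asyncio.gather does the bookkeeping that the pending counter and the ordered array do by hand above: it starts every operation, waits for all of them, and returns the results in the order of its arguments. The PROFILES dictionary and the function names are illustrative stand-ins for ProfileDB:

```python
import asyncio
import random

# In-memory stand-in for the profile database
PROFILES = {'john': 'manager', 'mark': 'developer', 'thomas': 'QA'}


async def get_user_profile(user):
    # Simulate a random delay between 0 and 0.1 seconds
    await asyncio.sleep(0.1 * random.random())
    if user not in PROFILES:
        raise KeyError('profile ' + user + ' not found')
    return user, PROFILES[user]


async def get_user_profiles(users):
    # Run every lookup concurrently; results keep the order of "users",
    # regardless of which lookup finishes first
    return await asyncio.gather(*(get_user_profile(u) for u in users))


profiles = asyncio.run(get_user_profiles(['john', 'mark', 'thomas']))
print(profiles)
```

If any lookup raises, gather propagates the first exception to the caller, which plays the role of the error callback.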

http://juandebravo.com/2013/03/31/javascript-perform-multiple-operations-in-parallel/

SBT run: choose automatically the App to launch

Dec 30, 2012 Updated Dec 30, 2012

I like Christmas because, besides resting from work and having fun with family and friends, there is usually time to learn something new. During this Xmas I’ve been playing with Scala: first, trying to finish the Coursera Functional Programming Principles course; later, working a bit on a personal project. Better late than never :smile:

As for my personal project, it provides more than one executable entry point:

web application exposing a REST API using Scalatra.
offline script to retrieve data from a third party periodically. This could be easily done with Rake in the Ruby world and, as far as I know, there’s no “de facto standard” task management tool in Python.

Working with Scala and SBT, the command sbt run looks like the natural alternative. It looks for every Scala object in the project that could be used as the application entry point:

an object that defines a main method
an object that inherits from App

If your application has more than one object meeting these requirements, the command sbt run will ask for your help to finish the execution.

Let’s consider the following snippet of code, with two objects that can act as entry points: Foo defines a main method and Bar extends App:

// File src/main/Foo.scala
object Foo {
    def main(args: Array[String]) = println("Hello from Foo")
}

// File src/main/Bar.scala
object Bar extends App {
    println("Hello from Bar")
}

When you execute the sbt run command, the following text shows up:

> sbt run

Multiple main classes detected, select one to run:

 [1] Bar
 [2] Foo

Enter number: 2
[info] Running Foo
Hello from Foo
[success] Total time: 29 s, completed Dec 30, 2012 11:36:28 PM

It requires human action (in the previous example, typing the number 2), as the run command does not receive any parameter to automate the process.

Fortunately, there’s an easy solution using the SBT plugin sbt-start-script :squirrel:. You just need to follow these three steps:

Create (or update) the file project/plugins.sbt, including:

addSbtPlugin("com.typesafe.sbt" % "sbt-start-script" % "0.6.0")

Create (or update) the file build.sbt, adding:

import com.typesafe.sbt.SbtStartScript
seq(SbtStartScript.startScriptForClassesSettings: _*)

Execute:

sbt update
sbt start-script

As a result, a new file target/start is created. This script takes the name of the main class to execute as its first argument:

> target/start Foo
Hello from Foo

> target/start Bar
Hello from Bar

Two last tips:

If your program has a single main class, the script does not require any argument.
Remember to add the automatically generated file target/start to your VCS.

http://juandebravo.com/2012/12/30/sbt-run-select-main-class/

About git internals

Dec 1, 2012 Updated Dec 1, 2012

Last week, the first Telefonica Digital Developers Conference was held in Madrid: a two-day event that gathered the Spain-based developer team for 48 hours of lectures and constant knowledge sharing.

My talk, “git internals”, highlighted how a git repository works internally when storing its information.

http://juandebravo.com/2012/12/01/git-internals-lecture/

Emoji support in jekyll

Nov 17, 2012 Updated Nov 17, 2012

While writing my last post I looked for information about how to include plugins in Jekyll. The Jekyll repo wiki describes how easy it is to write specific logic and hook it into your Jekyll site.

Nonetheless, if you are using GitHub as hosting to deploy your Jekyll site, you cannot use plugins :worried:.

1. Include the gemoji dependency in your Gemfile

gem 'gemoji', :require => 'emoji/railtie'

2. Add a configuration attribute in the _config.yml file

This folder will contain the emoji icons.

emoji:    gfx/emoji

3. Write a rake task into the Rakefile

This rake task copies the icons included in the gemoji gem into your Jekyll site folder. It also generates a CSS file.

desc 'Generate emoji CSS styles'
task :emoji do
  puts green 'Generating emoji CSS...'

  require 'jekyll'

  site = Jekyll::Site.new(Jekyll.configuration({}))

  path = site.config['emoji']

  if !path.empty? and !File.exist?("#{path}") and !File.exist?("#{path}/smiley.png")
    Dir::mkdir path

    _css = %[.emoji {
  width: 20px;
  display: inline-block;
  text-indent: -2000px;
}

]

    Dir["#{Emoji.images_path}/emoji/*.png"].each do |src|
      FileUtils.cp src, path
      *_, file = src.split("/")
      *emoji_name, _ = file.split(".")
      _css += %[.emoji_#{emoji_name.join(".")} {
  background:url("/#{path}/#{file}") no-repeat scroll 0 center transparent;
  background-size: 20px auto;
}

]
    end

    File.open "css/emoji.css", 'w+' do |file|
      file.write _css
    end
  end
  puts green 'Done!'
end

4. Execute the rake task

rake emoji

Now you can check the generated CSS file that defines a specific style per emoji icon and the png files (the emoji icons) copied into the configured folder.

5. Include the generated CSS file into HTML layouts

<link rel="stylesheet" href="/css/emoji.css">

6. Write a plugin that converts the emoji tags into HTML tags

Copy this content into the file _plugins/emoji.rb

require "gemoji"

module Jekyll
  module EmojiFilter

    def emojify(content)
      if @context.registers[:site].config['emoji']
        content.to_str.gsub(/:([a-z0-9\+\-_]+):/) do |match|
          if Emoji.names.include?($1)
            "<span class='emoji emoji_#{$1}'>#{$1} emoji</span>"
          else
            match
          end
        end
      else
        content
      end
    end # emojify

  end # EmojiFilter
end # Jekyll

Liquid::Template.register_filter(Jekyll::EmojiFilter)
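
To see what the filter does to a piece of text, here is the same substitution sketched in Python, outside Jekyll. The EMOJI_NAMES set is a hypothetical, tiny stand-in for Emoji.names, and the function is only an illustration of the regular expression logic, not something Jekyll can load:

```python
import re

# Hypothetical subset of emoji names, standing in for Emoji.names
EMOJI_NAMES = {'smile', 'squirrel'}

# Same pattern as the Ruby filter: a name wrapped in colons
EMOJI_TAG = re.compile(r':([a-z0-9+\-_]+):')


def emojify(content):
    def replace(match):
        name = match.group(1)
        if name in EMOJI_NAMES:
            # Known emoji: emit the span styled by the generated CSS
            return "<span class='emoji emoji_%s'>%s emoji</span>" % (name, name)
        # Unknown names are left untouched
        return match.group(0)

    return EMOJI_TAG.sub(replace, content)


print(emojify('Hello :smile: and :unknown:'))
```

Text between colons that is not a known emoji name passes through unchanged, which keeps things like timestamps or URLs from being mangled.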

7. Emojify your content!

Chain the emojify filter in the layouts where you want to include emojis.

<div id="post" role="main">
  {{ content | emojify }}
</div>

8. Write an emoji in a markdown file and run the server

For instance, to include a smile, write :smile:

9. Run jekyll

jekyll --server --auto

10. Enjoy

:neckbeard: :squirrel: :bug: :monkey: :scream: :hankey: :smile:

http://juandebravo.com/2012/11/17/emoji-support-in-jekyll/

CSS changes

Oct 28, 2012 Updated Oct 28, 2012

Today I’ve been tweaking the CSS that defines the layout of this site. Almost the whole style was defined by my pal @rafeca while creating his blog. I have borrowed it :smile:.

Although I am not an expert on CSS, I like hacking a bit of this :squirrel:.

I summarize the changes in the following sections.

Bottom links showing a “-“ character

I think this is a common issue in a lot of sites: when you define a link containing an image, an annoying hyphen character may appear on the right side of the image when the cursor hovers over it. It’s been happening on the footer links of this blog.

To change this behavior, remove the img element and define the image as background, ensuring the text inside the a element is indented outside the screen, far away from the visible divs.

    <footer>
      <!-- Before -->
      <a class="fadedlink" href="https://github.com/juandebravo">
        <img src="/gfx/github-logo.png" alt="@github">
      </a>
      <!-- Now -->
      <a class="fadedlink footer_link github" href="https://github.com/juandebravo">
        @github
      </a>
    </footer>

.footer_link {
  width: 25px;
  display: inline-block;
  text-indent: -1000px;
}

.github {
  background: url("/gfx/github-logo.png") no-repeat scroll 0 transparent;
}

Increase body font size

This one is the easiest :sweat_smile:.

body {
  margin: 0;
  line-height: 1.4;
  /* Before */
  font-size: 16px;
  /* Now */
  font-size: 18px;
}

List style type

In both the main and the open source pages, the li elements were using the default circle character. I’ve changed the CSS to use a Unicode code point instead, through the :before pseudo-element.

.container ul.posts {
  /* Do not use list decoration */
  list-style: none;
}

.container ul.posts li:before {
  /* Add a before content */
  content: "\0445";
}
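
The CSS escape "\0445" is a hexadecimal Unicode code point. To check which character it refers to, a quick sketch in Python (an illustration only, not part of the stylesheet):

```python
import unicodedata

# The CSS escape "\0445" is the hexadecimal code point U+0445
char = chr(0x0445)
print(char, unicodedata.name(char))
```

Any other code point can be looked up the same way before putting it into the content property.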

Predefined width on the left side while indexing stuff

The main page shows the list of post titles and their posting dates. The content was not aligned:

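Before:

<li class="post" style="list-style-type:none; padding-left: 30px">
  <span>05 Aug 2012</span>
  <a title="Ensuring Array as object type" href="/2012/08/05/ensuring-array-as-object-type">Ensuring Array as object type</a>
</li>
<li class="post" style="list-style-type:none; padding-left: 30px">
  <span>24 Jul 2012</span>
  <a title="Things I like using python (part II)" href="/2012/07/24/why-python-rocks_and_two">Things I like using python (part II)</a>
</li>

After:

<li class="post" style="list-style-type:none; padding-left: 30px">
  <span class="left_title">05 Aug 2012</span>
  <a title="Ensuring Array as object type" href="/2012/08/05/ensuring-array-as-object-type">Ensuring Array as object type</a>
</li>
<li class="post" style="list-style-type:none; padding-left: 30px">
  <span class="left_title">24 Jul 2012</span>
  <a title="Things I like using python (part II)" href="/2012/07/24/why-python-rocks_and_two">Things I like using python (part II)</a>
</li>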

.left_title {
  display: inline-block;
  min-width: 110px;
}

http://juandebravo.com/2012/10/28/css-changes/

How to paste code in keynote

Oct 21, 2012 Updated Oct 21, 2012

I use Twitter’s favourites feature to mark tweets that contain an interesting-looking link as “read it later”. Eventually I tidy up my long list of unread interesting stuff, and today I found the following item.

There was this tweet from Santiago Pastorino about how to include highlighted code into a Keynote.app presentation:

# Copy the text to the pasteboard from your editor

# <language> should be a programming language supported by pygmentize.

pbpaste | pygmentize -l <language> -f rtf | pbcopy

# Paste the pasteboard content into Keynote.app

As simple as that. Keynote.app automatically recognizes the RTF format, so the code is highlighted as defined by pygmentize. Awesome tip!

Two last comments:

if you try to do this using Microsoft PowerPoint for OS X, remember to choose “Special paste”. PowerPoint does not automatically recognize the RTF format, which by the way was developed by Microsoft. Great engineering work (the RTF format, http://en.wikipedia.org/wiki/Rich_Text_Format) with a really poor user experience (“special paste” sucks).
if you are creating a code-oriented lecture, I recommend giving Terminal Keynote a try. I’ve used it once with great results.

http://juandebravo.com/2012/10/21/how-to-paste-code-in-keynote/

First steps with Scala

Sep 26, 2012 Updated Sep 26, 2012

Last week I started a Coursera online course about functional programming using Scala. The course is led by Martin Odersky, the creator of the Scala programming language, so it’s a great chance to improve my limited knowledge of functional programming in general and Scala in particular.

When you start working with a new language, the first thing you should do is find the right tools, the ones that will make you feel comfortable while spending several hours learning and writing code.

I’ll summarize the steps I followed on my computer (running Mac OS X).

1. Install the interpreter

brew install scala

2. Install SBT, the build tool

SBT is for Scala what Maven is for Java. I haven’t used it in depth yet, but hopefully it’ll suck less than Maven.

brew install sbt

3. Write your first Scala code using the Scala REPL

Scala provides a built-in interpreter, the REPL (Read-Evaluate-Print Loop), to write and run Scala code very easily.

You can run the Scala REPL using either the sbt or the scala command:

# Using SBT
juandebravo [~/bin] λ sbt console
scala>
scala> println("Hello world")
Hello world
# Using scala
juandebravo [~/bin] λ scala
scala>
scala> println("Hello world")
Hello world

4. Configure your IDE of choice. Sublime Text 2

My coworker Toni Cebrian wrote a post comparing IDEs for writing Scala code. I won’t repeat his text, but I’ll write a bit about my personal choice: Sublime Text 2. I started using Sublime Text 2 a year and a half ago while coding ruby, I stayed with it when I switched to python, and I’ll give it a try for Scala too. Some time ago I starred this tweet about how to use Sublime Text 2 for Scala development, and eventually I read it :-)

It seems that the best option is to install the plugin sublime-ensime. It was a good surprise to find out that my pal @casualjim is the original author of this plugin. Sublime-ensime is a plugin that provides integration between Scala and ENSIME, the ENhanced Scala Interaction Mode for Emacs. Follow the instructions in the sublime-ensime github main page to make it work.

Some months ago I created a project showing some cool Scala features; check it out in my github repository. To run it using Sublime Text 2, create a new Build system with the following configuration:

{
	"cmd": ["sbt", "test"],
	"selector": "source.scala",
	"working_dir": "${project_path}"
}

http://juandebravo.com/2012/09/26/first-steps-with-scala/

Dynamic loading

Sep 16, 2012 Updated Sep 16, 2012

In the project I am working on right now, we are reusing a Django project to develop two different products, productA and productB :-). These products frequently require the same service layer and the code can be directly reused, but in some scenarios different code is required.

Reusing the Django project gives us some advantages. Here I highlight some of them:

authentication
authorization
logging
URL routes
testing strategy
security tests
third party integrations

Our current solution to run different logic in both products, instead of using project configuration, is dynamic loading. To implement it, we have defined our own code structure respecting Django default structure:

- django_project_root
  - django_app_1
    - __init__.py
    - services.py
    - projectA
       - __init__.py
       - services.py
    - projectB
       - __init__.py
       - services.py
  - django_app_2
    - __init__.py
    - services.py
    - projectB
      - __init__.py
      - services.py

If the Django application is running as projectA, the service logic being used is:

for django_app_1 application, the module django_app_1.projectA.services
for django_app_2 application, the module django_app_2.services (as django_app_2.projectA.services module is not created).

ProjectB has defined specific logic in both Django applications, and therefore no generic one is used.

Find underneath the snippet of code (simplified) we’re using to know which module must be loaded:

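import importlib
import logging

logger = logging.getLogger(__name__)

def get_module(application_name, project, module):
    module_name = application_name + '.' + project + '.' + module
    try:
        # import_module returns the submodule itself; __import__ would return the top-level package
        return importlib.import_module(module_name)
    except ImportError as ex:
        logger.warning("Unable to import module: {0} -> {1}".format(module_name, ex))
        # Fall back to the generic module (e.g. django_app_1.services)
        return importlib.import_module(application_name + '.' + module)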

# Example

get_module('django_app_1', 'projectA', 'services')

Pretty simple. I’m trying to load a module, and if it fails, I’m logging a warning. Sometimes the warning is the expected behavior (i.e. projectA does not define its own django_app_2 service logic), but it could also happen that an exception is raised while parsing a module code, and therefore the module cannot be imported (i.e. projectB django_app_2 service logic has a syntax error). Not logging this situation will hide possible undesired errors.
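
One way to tell the two situations apart, as a minimal sketch written for Python 3: an ImportError carries a name attribute with the module that could not be found, so the expected case (the project-specific module simply does not exist) can be logged quietly, while any other import failure gets a warning. The helper name load_with_fallback and the log messages are assumptions for illustration, not part of our project code.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_with_fallback(specific_name, generic_name):
    """Import specific_name, falling back to generic_name (hypothetical helper)."""
    try:
        return importlib.import_module(specific_name)
    except ImportError as ex:
        missing = ex.name or ""
        # Expected case: the specific module (or one of its parent packages)
        # does not exist at all.
        expected = isinstance(ex, ModuleNotFoundError) and (
            specific_name == missing or specific_name.startswith(missing + "."))
        if expected:
            logger.info("No specific module %s, using %s", specific_name, generic_name)
        else:
            # The module exists but failed while importing something else,
            # e.g. a missing dependency: this deserves attention.
            logger.warning("Broken module %s: %s", specific_name, ex)
        # Note: a SyntaxError inside the module is not an ImportError,
        # so it propagates to the caller instead of being swallowed here.
        return importlib.import_module(generic_name)


# Usage with standard library modules standing in for the Django apps
fallback = load_with_fallback("json.does_not_exist", "json")
specific = load_with_fallback("json.decoder", "json")
```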

This is something that Django is not doing (at least in version 1.3.1), and it caused me some pain last Friday. This command:

python manage.py collectstatic --settings=projectA_settings

generated the following result:

Unknown command: 'collectstatic'

WTF! I was going crazy, as I was pretty sure the django.contrib.staticfiles app was installed in the projectA settings file. After some debugging I found the problem: Django is using dynamic loading, I had a missing dependency in a model module, and the projectA settings weren’t being loaded, therefore collectstatic was not a valid command.

Conclusion

To sum up, be careful when loading your code dynamically, and log any error that may be raised during the process. I do, you and Django should too. Otherwise, weird errors will happen because the root of the problem is being hidden.

http://juandebravo.com/2012/09/16/dynamic-loading/

Baruco, First day

Sep 8, 2012 Updated Sep 8, 2012

Introduction

This weekend the CosmoCaixa building, in Barcelona, is hosting the BArcelona RUby COnference.

So I’ll summarize the weekend talks, let you know about the awesome speakers, the wonderful venue and the great organization.

Keynote by Scott Chacon. Back to First Principles

Scott Chacon, GitHub cofounder, was the first speaker on stage. He talked us through how he sees the future of working for a software company, or at least how he’d like it to be:

No timetable: stop working 8 to 17; start acting like an artist who needs inspiration and works in creative moments.
No vacations: everyone should know when they can go on holidays, with no rules on that. (I know this is already in place in some companies, such as Netflix)
No expenses sheets: give people the freedom to decide what, when and how much they may spend.
No managers: consider when or why you need a manager. In those situations, could he be replaced by a software program? :)

All of the above defines the way an open source project works nowadays.

Some examples of companies that are questioning the usual way to work:

RubyMotion

RubyMotion is a commercial product (162,61 € a licence, with a free licence alternative coming soon) that lets you create iOS applications using the Ruby language.

Good points:

Reuse your Ruby knowledge instead of learning a new language, new patterns, etc.
Use the expressiveness of Ruby instead of Objective-C syntax
Get rid of XCode and use your IDE/text editor of choice
Community
Commandline: you can use your commandline to build the project
Storyboards
Cocoapods

How to create the simplest RubyMotion application:

motion create sample
cd sample
rake

The speakers pointed out that ARC (Automatic Reference Counting) is the gear that enables RubyMotion (so you need at least iOS 5). Underneath, RubyMotion uses MacRuby.

In their view, it should not be used in production right now; perhaps in six months’ time.

My personal opinion: you should learn more than one programming language and at least try the native Objective-C language, then form your own opinion about whether to use RubyMotion or any other high-level framework such as PhoneGap or Titanium.

Getting consistent behavior across your API

Principles for either internal or external API:

Principle of least surprise
Consistent formatting
Consistent naming and format in the overall API
Handle unexpected responses carefully
Ensure you are not creating bad or missing error messages
Let people know what you accept and give them examples

How to achieve:

Centralize behavior
DRY (don’t repeat yourself)

Wrap up:

Treat your API like the interface it is
Aim for consistency
Become your own client and challenge your API design

Deconstructing the framework

Gary Bernhardt talked about how useful the SRP (Single Responsibility Principle) is when creating a well-defined application that has to be maintained over a long period. I do agree. Regarding this principle, he mentioned the Rails controller as a piece of software that, by default, is in charge of several tasks: authorization, authentication, service logic, form validation, serialization. He has been working on an open source project called raptor, a proof of concept (not ready for production use) about how to split code and separate responsibilities in a Rails application.

Life beyond HTTP

Consider a protocol as a tool you can use to improve your system. Some application protocols:

SMTP (Simple Mail Transfer Protocol)
DNS (Domain Name System)
XMPP (Extensible Messaging and Presence Protocol)
IRC
SSH (Secure Shell)
STOMP (Streaming Text Oriented Messaging Protocol)
SPDY: multiplexed HTTP:
- opens a single connection over which several requests can be sent in parallel
- request prioritization
- compressed headers
- server-pushed streams

Why Agile

Software engineering has failed during the past decades while trying to achieve:

Reduce code
Eliminate human errors
Eliminate project variability

Paolo suggests using the scientific method when developing software:

observation
hypothesis
experiment

This is something that has lately made the news as part of the Lean Startup approach.

Lightning talks

Last, but not least, there were ten lightning talks (5’ each) about different stuff, like graph databases and the new service by aentos called GrapheneDB.

Wrap up

Great speakers (Scott Chacon, Anthony Eden, Paolo Perrotta), great contents (API uniformity, Deconstructing the framework), cool people, and two new t-shirts for my ever growing collection.

http://juandebravo.com/2012/09/08/baruco-saturday/

Ensuring Array as object type

Aug 5, 2012 Updated Aug 5, 2012

It happens often that you need an array as object type, and even if you have defined your API like that, you still want to double check that the parameter received in your method is an array:

def foo(param)
  param = Array[param] if !param.is_a?(Array)
  # Service logic here
end

Fortunately, ruby has the following feature, which converts your parameter to an array, or does nothing in case it is already an array:

def foo(param)
  param = Array(param)
end

This feature does not exist in python, or at least I haven’t found it.

The following snippet covers that functionality:

# list_utils.py

def list_(values, *more_values):
    """
    Creates a list using one or more elements.
    If the parameter is a list, do nothing
    """
    if not isinstance(values, list):
        values = [values]
    if len(more_values) > 0:
        values.extend(more_values)
    return values

Example:

from list_utils import list_

def send_mail(destinations, *args, **kwargs):
    # Accept either a single destination or a list of them
    destinations = list_(destinations)

    for destination in destinations:
        pass  # your service logic

You can check the behavior with the following chunk of unit tests:

import unittest

from list_utils import list_

class ListTests(unittest.TestCase):

    def test_parameter_is_a_string(self):
        self.assertEqual(list_("foo"), ["foo"])

    def test_parameter_is_an_array(self):
        self.assertEqual(list_(["foo"]), ["foo"])

    def test_parameter_is_an_integer(self):
        self.assertEqual(list_(1), [1])

    def test_parameter_is_a_list_with_multiple_elements(self):
        self.assertEqual(list_([1,2,3,4]), [1,2,3,4])

    def test_parameter_is_multiple(self):
        self.assertEqual(list_(1,2,3,4), [1,2,3,4])

    def test_parameter_is_none(self):
        self.assertEqual(list_(None), [None])

if __name__ == '__main__':
    unittest.main()
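
One caveat worth knowing: when the first argument is already a list and extra values are passed, list_ extends that same list in place, so the caller’s list changes too. A non-mutating variant could look like the sketch below; the name as_list is hypothetical and this is just one possible way to do it.

```python
def as_list(values, *more_values):
    """Return a new list built from one or more elements (hypothetical variant of list_)."""
    if isinstance(values, list):
        # Shallow copy, so the caller's list is left untouched
        result = list(values)
    else:
        result = [values]
    result.extend(more_values)
    return result


original = [1, 2]
combined = as_list(original, 3, 4)
```

After running it, combined holds the four values while original still holds only the first two.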

http://juandebravo.com/2012/08/05/ensuring-array-as-object-type/

Things I like using python (part II)

Jul 24, 2012 Updated Jul 24, 2012

Introduction

Here goes my second and last post in the “things that I like about coding in python” series. The first post was missing a short section on how cool the following features are:

Decorators
Context managers
Use blank spaces to define code blocks

Decorators

A decorator is a function that takes at least one argument, a function object, and returns a single value, also a function object. Taking advantage of python’s closure support, it’s commonly used to add new features to the original function object (the one received as argument).

Decorators were defined in PEP 318 as a way to ease the definition of class and static methods.

Before decorators reached python, the following excerpt was needed to create a class/static method:

class Foo(object):
    
    def bar(self, name):
        return "Hello {0} from my class method".format(name)

    def bazz(name):
        return "Hello {0} from my static method".format(name)
    
    # Convert bar from instance to class method
    bar = classmethod(bar)
    
    # Create a static method
    bazz = staticmethod(bazz)

if __name__ == '__main__':
    print Foo.bar('John Doe')
    print Foo.bazz('John Doe')

Reminder: the main difference between static and class methods is that a class method receives the class object as its first parameter, so it can use class state and picks up the subclass when called through one; a static method receives no implicit argument at all. Both can be overridden by a child class. At this point, I don’t find a good reason to define a static method though.

Using decorators and their syntactic sugar (the @ symbol), the previous excerpt can be rewritten as:

class Foo(object):
    
    # Define bar as a class method
    # As first parameter we could use 'self', but the de facto standard is to use 'cls' when
    # defining class methods
    @classmethod
    def bar(cls, name):
        return "Hello {0} from my class method".format(name)

    # Define bazz as a static method
    @staticmethod
    def bazz(name):
        return "Hello {0} from my static method".format(name)
    
if __name__ == '__main__':
    print Foo.bar('John Doe')
    print Foo.bazz('John Doe')
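
To make the class/static reminder above concrete, here is a small sketch (the Shape and Triangle classes are made up for illustration): a class method receives the class it was called on, while a static method receives no implicit argument.

```python
class Shape(object):
    sides = 0

    @classmethod
    def describe(cls):
        # cls is whichever class the method was called on
        return "{0} with {1} sides".format(cls.__name__, cls.sides)

    @staticmethod
    def unit():
        # No implicit argument: this function cannot see cls or self
        return "cm"


class Triangle(Shape):
    sides = 3


base_description = Shape.describe()
child_description = Triangle.describe()
```

Calling describe through Triangle picks up the subclass attributes without redefining the method.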

The best blog post explaining decorators I’ve found so far is this one from Steve Ferg. I find decorators quite useful when you need transversal functionalities in your code. A set of examples can be found in wiki.python.org.

Example

The following code creates a decorator that dumps the arguments that a function/method receives when executed. The base of that code is taken from wiki.python.org#Easy_Dump_Of_Function_Arguments.

# decorator_utils.py

import logging

logging.basicConfig(level=logging.DEBUG)

def dump_args(func):
    # get function arguments name
    argnames = func.func_code.co_varnames[:func.func_code.co_argcount]

    # get function name
    fname = func.func_name
    logger = logging.getLogger(fname)

    def echo_func(*args, **kwargs):
        """
        Log arguments, including name, type and value
        """
        def format_arg(arg):
            return '%s=%s<%s>' % (arg[0], arg[1].__class__.__name__, arg[1])
        logger.debug(" args => {0}".format(', '.join(
            format_arg(entry) for entry in zip(argnames, args) + kwargs.items())))
        return func(*args, **kwargs)

    return echo_func

# example.py

from decorator_utils import dump_args

class UserModel(object):
    """
    User model object
    """

    def __init__(self, user_id=None):
        self.user_id = user_id

    @classmethod
    @dump_args
    def find_by_id(cls, user_id):
        pass

    @dump_args
    def update(self, **kwargs):
        pass

    def __str__(self):
        return unicode(self).encode('utf-8')

    def __unicode__(self):
        return str(self.user_id)


@dump_args
def f1(user_id, arg1, arg2, **kwargs):
    pass


if __name__ == '__main__':
    UserModel.find_by_id('foo')

    u = UserModel('879234-32423423')
    u.update(name="John", surname="Doe")
    f1(u, 2, 3)

    f1(u, 2, 3, foo='bazz', bar=23)

The execution of the previous code generates the following output:

λ python example.py

DEBUG:find_by_id: args => cls=type<<class '__main__.UserModel'>>, user_id=str<foo>
DEBUG:update: args => self=UserModel<879234-32423423>, surname=str<Doe>, name=str<John>
DEBUG:f1: args => user_id=UserModel<879234-32423423>, arg1=int<2>, arg2=int<3>
DEBUG:f1: args => user_id=UserModel<879234-32423423>, arg1=int<2>, arg2=int<3>, foo=str<bazz>, bar=int<23>

As shown above (method find_by_id), you can use more than one decorator on a function/method (they are applied bottom-up).
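
A note on ordering, as a small Python 3 sketch: decorators are applied bottom-up at definition time, which means that at call time the wrapper added by the topmost decorator runs first. The tag decorator and the calls list below are hypothetical names used only to record that order; functools.wraps keeps the original function name and docstring on the wrapper.

```python
import functools

calls = []


def tag(label):
    """Build a decorator that records its label every time the function is called."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            calls.append(label)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@tag("outer")
@tag("inner")
def greet(name):
    """Return a greeting."""
    return "Hello " + name


result = greet("John")
```

Here greet is first wrapped by tag("inner") and the result is then wrapped by tag("outer"), so calls ends up with "outer" recorded before "inner".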

Context managers (with statement)

A context manager allows you to create and manage a run-time context. The context is created when a with statement starts, it’s available while the code inside the with block runs, and it is exited at the end of the with block. The most common scenario is resource allocation: a context manager ensures you hold the resource only while it’s actually required and releases it when it should not be used anymore (of course, if you define your own context manager, python needs you to write that code properly).

The basic example using python native library to handle a file object:

with open('/var/log/events.log', 'w') as f:
    n = f.write("New user created")

The file /var/log/events.log is opened when entering the context manager and closed when the code block finishes, even if an exception is raised inside it. You don’t need a try/finally to close the file yourself.

To create a context manager you need to define a class that implements two methods, __enter__ and __exit__. In the following code I’m creating a context manager, user, that retrieves an object from an external source and stores it back if updated:

class User(dict):
    """
    Database object
    """
    def __init__(self, user_id, **kwargs):
        self.user_id = user_id
        self.update(kwargs)

    def has_changed(self):
        # logic to check if any user property has been updated
        return True

class user(object):
    """
    Context manager for a User object
    """

    def __init__(self, user_id):
        self.user_id = user_id

    def __enter__(self):
        """
        This code block is executed while entering a context manager
        """
        # mock that returns always a basic User
        self.user = User(self.user_id, name="John", surname="Doe")
        return self.user

    def __exit__(self, _type, value, tb):
        """
        This code block is executed at context manager exit
        """
        if self.user.has_changed():
            # here the save logic
            pass

if __name__ == '__main__':
    with user('00000-11111') as u:
        u['name'] = 'Johnny'
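
For simple cases the standard library offers a shortcut: contextlib.contextmanager turns a generator function into a context manager, with the code before the yield playing the role of __enter__ and the code after it the role of __exit__. The sketch below is a loose equivalent of the class above, under the assumption that "saving" just means appending to a list; the saved list and the plain dict record are stand-ins, not a real persistence layer.

```python
import contextlib

saved = []  # stand-in for the external storage


@contextlib.contextmanager
def user(user_id):
    # Mock that always builds the same basic record
    record = {"user_id": user_id, "name": "John", "surname": "Doe"}
    original = dict(record)
    try:
        yield record
    finally:
        # Runs when the with block exits, even if it raised an exception
        if record != original:
            saved.append(dict(record))


with user("00000-11111") as u:
    u["name"] = "Johnny"
```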

Switching to ruby, something similar can be achieved with the following snippet:

class User < Hash
  attr_reader :user_id
  
  def initialize(user_id, params = {})
    @user_id = user_id
    self.update(params) if params.length > 0
    if block_given?
      yield self
      if self.has_changed?
        # here the save logic
      end
    end
  end
  
  def has_changed?
    # logic to check if any user property has been updated
    return true
  end
  
  class << self
    def find!(user_id, &block)
      # mock that always returns a basic User instance;
      # the block, if given, is forwarded to User#initialize
      User.new(user_id, {name: "John", surname: "Doe"}, &block)
    end
  end
end

User.find!('0000-1111') do |u|
  u[:name] = "Johnny"
end

Use blank spaces to define code blocks

Not too much to say about this. I think it increases readability.

Conclusion

I hope you’ve found these articles interesting. I’m sure some good points, like functions being first-class citizens or the collections and functools modules, are missing but at this point I just wanted to highlight my five coolest features. Let me know which are yours so my top five list could easily be “re-prioritized” :-)

http://juandebravo.com/2012/07/24/why-python-rocks_and_two/

Things I like using python (part I)

Jul 16, 2012 Updated Jul 16, 2012

Introduction

As you know, these last months I’ve spent quite some time coding python, the language chosen for the project to which I’ve devoted heart, soul and most of my weekends too…

During the first weeks I really struggled to get the code alive, as during the previous two years it had been all about ruby, so python took me out of my comfort zone, which really hit me. Although I cannot call myself a python expert (yet), I’m enjoying this new friendship.

Things I like using python

I’m going to share some of my favourite features, and I’d like to know yours, your thoughts about them, and any missing bit that may be key:

List comprehensions
Generators
Decorators
Use blank spaces to define code blocks
Context managers

Find below a brief description and an example of the first two items.

List comprehensions

As python doc says, “list comprehensions provide a concise way to create lists”.

Example

users = [{'name': 'John Doe', 'email': 'john@doe.com'},
		 {'name': 'Mike Cunhingam', 'email': 'mike@cunhingam.com'}]

# Retrieve users email
emails = [user['email'] for user in users]

Of course something similar can be done in ruby, but after some weeks I felt comfortable with the idea of iterating over the objects in an array without calling a specific object method:

users = [{name: 'John Doe', email: 'john@doe.com'},
		 {name: 'Mike Cunhingam', email: 'mike@cunhingam.com'}]

# Retrieve users email
emails = users.map{|x| x[:email]}

List comprehensions can be used with any iterable object, such as string and list instances.
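
A couple of quick illustrations of that point, as a sketch with made-up data: the first comprehension iterates over a string, the second over a tuple of dicts and adds a filter.

```python
# A string is iterable character by character
vowels = [char for char in "John Doe" if char.lower() in "aeiou"]

# Any other iterable works too, here a tuple of dicts, with an optional filter
users = ({'name': 'John Doe', 'email': 'john@doe.com'},
         {'name': 'Mike', 'email': None})
emails = [user['email'] for user in users if user['email']]
```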

Generators

Again reading through the python docs, “generators are a simple and powerful tool for creating iterators”, covered in PEP 255. Generators may be used when you need to maintain state between the values produced, and they allow you to avoid callback functions.

Let’s imagine that the Github API only allows downloading one user gist per API call. In the example below we’re using a generator to create an iterator over user gists. To retrieve a user gist we’re maintaining the state between calls (the current page) and we’re retrieving the data only when it is actually needed. Of course another approach could be to retrieve a chunk of gists and return them upon request, but it seems a good example about how to use generators :-). Kudos to @anarchyco for inspiring me to write this code.
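
The idea can be sketched as follows. This is a stand-in, not the original code and not the real Github API: fetch_gist_page, FAKE_GISTS and fetch_log are hypothetical names that simulate a one-gist-per-call service in memory, so the sketch only shows how the generator keeps the current page between calls and fetches lazily.

```python
FAKE_GISTS = ["gist-1", "gist-2", "gist-3"]
fetch_log = []  # records which pages were requested, and when


def fetch_gist_page(page):
    """Hypothetical stand-in for an API call that returns one gist per page."""
    fetch_log.append(page)
    if page <= len(FAKE_GISTS):
        return FAKE_GISTS[page - 1]
    return None


def user_gists():
    """Yield gists one by one, keeping the current page as generator state."""
    page = 1
    while True:
        gist = fetch_gist_page(page)
        if gist is None:
            return  # no more gists: the iteration ends here
        yield gist
        page += 1


gists = user_gists()
first = next(gists)             # only page 1 has been fetched so far
pages_after_first = list(fetch_log)
rest = list(gists)              # the remaining pages are fetched on demand
```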

To be continued…

http://juandebravo.com/2012/07/16/why-python-rocks/

Fix latest commit branch using git

Jun 26, 2012 Updated Jun 26, 2012

Introduction

I like git, I hope you like it too. I like using the git flow branching model while developing new features or fixing undesired bugs in a stable version.

Sometimes, while you’re starting a new feature, you forget to change your current branch and end up committing a change to the wrong branch (often develop).

Let’s walk through the steps required to move your latest commit from the wrong branch to the right one:

1.- Prerequisites

Create the repository with an initial content and create the develop branch.

bundle gem foo_bar
cd foo_bar
git commit -am "first commit"
git checkout -b develop
# modify version.rb file to increase the version number
git commit -am "bump version"

Create your feature branch

git branch feature/hello
git lg                                                                                                                                   
* d62b49a - (HEAD, feature/hello, develop) bump version - juandebravo (66 seconds ago)
* 9158f06 - (master) first commit - juandebravo (2 minutes ago)

Commit a change to develop (instead of your feature branch)

echo "require 'foo_bar/version'\n\nmodule FooBar\n  def self.hello(name)\n    print \"hello #{name}\"\n  end\nend" > lib/foo_bar.rb
git commit -am "add hello method"

WRONG! You committed to develop instead of feature/hello

git lg
* ecd59c3 - (HEAD, develop) add hello method - juandebravo (2 seconds ago)
* d62b49a - (feature/hello) bump version - juandebravo (40 minutes ago)
* 9158f06 - (master) first commit - juandebravo (40 minutes ago)

2.- Fix the wrong commit

Merge develop to your feature branch

git checkout feature/hello
git merge develop
git lg
* ecd59c3 - (HEAD, feature/hello, develop) add hello method - juandebravo (3 minutes ago)
* d62b49a - bump version - juandebravo (43 minutes ago)
* 9158f06 - (master) first commit - juandebravo (44 minutes ago)

You now have the change in your feature branch, so it’s time to remove it from the develop branch.

Remove the commit from develop branch

git checkout develop
git reset --hard HEAD~1
git lg
* ecd59c3 - (feature/hello) add hello method - juandebravo (4 minutes ago)
* d62b49a - (HEAD, develop) bump version - juandebravo (44 minutes ago)
* 9158f06 - (master) first commit - juandebravo (44 minutes ago)

That’s all!!

http://juandebravo.com/2012/06/26/git-tip-how-to-change-the-branch-of-a-commit/

Yet another blog in the world

May 31, 2012 Updated May 31, 2012

Hello world

During the last months I’ve felt an increasing need to share my thoughts with the tech world for the following reasons:

I think it’s a good way to keep on learning English. Yes fellas, I’ve been working on Shakespeare’s language for a while but there is a whole road of new words and expressions to take in. Sad but true. So, never give up. Learn!
One of my new year resolutions was to embrace a new programming language. It’s incredible how much you can learn on a journey like that, and sometimes I need to share some bits of that new knowledge. Some of them might be obvious to you, but they are new to me; that’s why they are here. As my motto says: “never stop learning”.

Why Jekyll and not Wordpress?

In 2011 I worked mainly on an innovation project where we had the chance to test and use different tools to build our platform and service. It was an awesome adventure. One of the things we did was write all the documentation our service needed in Markdown format and run a Rake task to generate the HTML pages that any HTTP server in the world could serve. Yes, we did not need to follow any heavy editorial flow; we were innovators. Kudos to @osuka for giving me that freedom.
ruby, github and git are my tools of choice whenever I can use them and I’m a big fan of UNIX shell so it’s kind of familiar environment for me. Kudos to GitHub pages for supporting Jekyll. I can use git flow to write my blog!
Last, but not least, my fellow @rafeca started a blog during our international assignment in Jajah. I’m using Jekyll and I have borrowed his layout to build this blog. Hopefully I’ll write more often than he does, though. For me it’s also a way to share experiences with him now that our careers have diverged :-) Please check his explanation about how to create a blog like this from scratch.

What will this be about

Well, I’m always reading new stuff, you know, never stop learning. I’ll try to show some of the new stuff I’m reading about. At this point, could be something related to:

ruby, my programming language of choice.
python, the language I’m currently using more widely.
git, the SCM that is amazing IMO.
Adhearsion, the open source telephony framework I actively contributed to for some months, which is led by the awesome Ben Klang, Ben Langfeld and Jason Goecke.
Freeswitch and ESL, the telephony platform I’m hacking with (not so often though)
Scala: learning Scala was my first new year resolution. I’ve started learning it now and I do like the syntax. I don’t have a strong opinion about it yet, but it seems quite amazing.

http://juandebravo.com/2012/05/31/yet-another-blog-in-the-world/

https://feeds.feedburner.com/juandebravo
