Django projects often call for a robust setup to ensure a smooth development and deployment process. Cookiecutter Django is a popular project template that aims to offer Django users a comprehensive, out-of-the-box setup, including configurations for databases, templates, and much more. Cookiecutter Django exclusively supports PostgreSQL, reflecting its focus on production-level applications where PostgreSQL's advanced features can be a real asset. However, there are scenarios where a developer wants to use SQLite instead, perhaps for small-scale applications, quick prototypes, or simply due to personal preference or familiarity with SQLite.
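As a rough illustration of what pointing a Django project at SQLite involves, here is a minimal settings sketch. It uses Django's built-in `django.db.backends.sqlite3` engine. The `BASE_DIR` value and the `db.sqlite3` file name are assumptions for this example, and a project generated by Cookiecutter Django may configure its database differently.

```python
from pathlib import Path

# Assumption: the project root is the current working directory.
# Real settings modules usually derive this from __file__ instead.
BASE_DIR = Path.cwd()

# Standard Django database setting using the built-in SQLite backend.
# The file name "db.sqlite3" is an illustrative choice.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": str(BASE_DIR / "db.sqlite3"),
    }
}
```

SQLite needs no server process, so the only thing to manage here is the location of the database file.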
In the world of data warehousing and large-scale data analysis, table formats like Apache Iceberg play a pivotal role in managing massive datasets. If you're dealing with Iceberg tables, maintaining and optimizing them can lead to significant performance improvements. Let's delve into the three key ways to maintain Iceberg tables for optimized performance and data management.
Partitions

Partitioning your Iceberg tables is one of the simplest and most effective methods to boost performance.
After the release of macOS 10.15.2 two days ago, I upgraded my Mac at work to the latest version today. Immediately after running pip to install some packages, I was greeted with an abort. I checked the crash reporter to find the offender:
    Application Specific Information:
    /usr/lib/libcrypto.dylib
    abort() called
    Invalid dylib load. Clients should not load the unversioned libcrypto dylib as it does not have a stable ABI.

After poking around for a bit I figured out it was because of the asn1crypto library.
When dealing with geospatial data, it is sometimes useful to have a grid at hand that represents the given data. One way to create such a grid is to use Geohashes. Geohashes are a hierarchical spatial data structure that subdivides space into grid-shaped buckets; they are one of the many applications of Z-order curves and, more generally, space-filling curves. A Geohash is an encoded character string that is computed from geographic coordinates.
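To make the encoding concrete, here is a minimal sketch of the standard Geohash algorithm: longitude and latitude are bisected alternately, each step contributes one bit, and every five bits become one base-32 character. The function name `geohash_encode` and the default precision are choices made for this example, not taken from any particular library.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a Geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1  # upper half of the interval
            rng[0] = mid
        else:
            bits <<= 1              # lower half of the interval
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1

        if bit_count == 5:          # five bits form one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))  # u4pru
```

Because each additional character refines the previous cell, truncating a Geohash yields the enclosing, coarser grid cell.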
On October 5th the PostgreSQL Global Development Group announced the release of PostgreSQL 10. It comes with a tremendous number of new features, like

- Table partitioning
- Logical replication
- Improved parallel queries
- Stronger password hashing
- Durable Hash Indexes

and more. A nice list, including explanations, can be found on Robert Haas’ blog.
This post explains how to upgrade to the latest version of PostgreSQL on macOS using Homebrew. At the time of this writing I was using macOS 10.
After three great days at PyCon US 2017 in Portland, OR, Hendrik and I decided to participate in the development sprints following the conference. The code sprints are an essential part of PyCon, and a chance to meet some of the maintainers and contributors of various open source projects. For us it was the first time attending a code sprint.
The day before the sprint there was a session helping people set up Git and Python (including virtual environments) and get familiar with version control.
Google Ads, the globally renowned advertising platform, empowers numerous businesses to strategically place ads, reach prospective customers, and grow their presence. The kaleidoscope of data that Google Ads provides forms the bedrock of insightful business decisions, higher return on investment, and the optimization of AdWords campaigns.
While Google Ads features a user-friendly interface for data access and management, some tasks often benefit from a programmatic approach. In response to this need, Google provides the AdWords API.
If you want to get in contact with me, shoot me an email or connect via
- https://twitter.com/Tafkas
- https://github.com/Tafkas
- https://gitlab.com/Tafkas
- https://linkedin.com/in/stadeschuldt
- https://google.com/+ChristianStadeSchuldt
Recently, I set up Jupyter Notebooks on a server at work. The idea was to create an environment where every team member could run analyses using Python and share the results with the rest.
After reading the documentation, I found out that the Jupyter Notebook web application comes with a Contents API.
I quickly put together a little Munin script that collects some statistics about the current notebooks.
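The original script is not shown here, so the following is only a sketch of how such a plugin could look, not the author's implementation. It assumes the server exposes the Jupyter REST endpoints `/api/contents` and `/api/sessions`, that token authentication is enabled, and that counting sessions is a good enough proxy for open notebooks. `BASE_URL`, `TOKEN`, the field names and the graph labels are all hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8888"  # hypothetical server address
TOKEN = "changeme"                  # hypothetical API token


def fetch_json(path):
    """GET a Jupyter REST endpoint and decode the JSON body."""
    request = urllib.request.Request(
        BASE_URL + path, headers={"Authorization": "token " + TOKEN}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_notebooks(directory_model):
    """Count notebook entries in one directory listing (not recursive)."""
    entries = directory_model.get("content") or []
    return sum(1 for entry in entries if entry.get("type") == "notebook")


def munin_output(total, open_count, config=False):
    """Build the text a Munin plugin prints for 'config' or a normal run."""
    if config:
        return "\n".join([
            "graph_title Jupyter notebooks",
            "graph_vlabel notebooks",
            "graph_category jupyter",
            "total.label total notebooks",
            "open.label open notebooks",
        ])
    return "total.value {}\nopen.value {}".format(total, open_count)


def main(argv):
    """Entry point: print the graph config or the current values."""
    if len(argv) > 1 and argv[1] == "config":
        print(munin_output(0, 0, config=True))
        return
    total = count_notebooks(fetch_json("/api/contents"))
    open_count = len(fetch_json("/api/sessions"))
    print(munin_output(total, open_count))
```

A real plugin would call `main(sys.argv)` at the bottom of the file. Note that this sketch only counts notebooks in the top-level directory; a recursive walk over subdirectories would be needed for nested folders.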
The graph shows the total number of notebooks on the server as well as the currently open notebooks:
There are a lot of cases where we want to track the time at which an entity was created or updated. Here is a simple recipe to make some or all of your SQLAlchemy entities auto-timestamping. To achieve this, we will provide a mixin class.
    from datetime import datetime
    from sqlalchemy import Column, DateTime, event

    class TimeStampMixin(object):
        """ Timestamping mixin """
        created_at = Column(DateTime, default=datetime.utcnow)
        created_at._creation_order = 9998
        updated_at = Column(DateTime, default=datetime.utcnow)
        updated_at.
Every day at work around noon the question of where to get lunch comes up. Normally, we choose between different restaurants in the vicinity of the office. One exception is the HU Mensa (university cafeteria). Despite being really cheap, the food quality there varies a lot, and it really depends on the daily menu whether a visit is worthwhile.
To tackle this issue I decided to spend another IT Open Space putting together a little script that will help us in the future.
At Project-A we are using Codebase as a project management tool together with its version control. Just as with any other tool, you can create tickets and organize them in sprints. Our usual (very simplified) workflow includes:

- Sprint planning for tickets
- Prioritizing tickets
- Developer working on tickets
- Product managers verifying if the tickets were implemented as intended

Unfortunately, sometimes your backlog keeps growing and tickets are no longer valid, outdated or, in the worst case, just forgotten.
An ETL import graph is built on the logical dependencies of the jobs on each other. So typically a SQL transformation job depends on all the previous jobs that create the tables used in the query. But once there are a certain number of jobs, dependencies often get a bit more complicated and some of them become redundant in the process.
A simple example can be seen in the dependency graph in the figure, where the three red edges are redundant.
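Removing redundant edges from a dependency graph is known as computing its transitive reduction. The following is a minimal sketch of that idea for an acyclic graph, not necessarily the approach taken in the post. The job names and the `dependencies` mapping (job to the set of jobs it depends on) are made up for this example.

```python
def transitive_reduction(dependencies):
    """Return a copy of an acyclic dependency graph without redundant edges.

    dependencies maps each job to the set of jobs it directly depends on.
    """
    def ancestors(start):
        # All jobs reachable from start by following dependency edges.
        seen, stack = set(), list(dependencies.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(dependencies.get(node, ()))
        return seen

    reduced = {}
    for job, parents in dependencies.items():
        # A direct dependency is redundant if another direct dependency
        # already depends on it, directly or indirectly.
        implied = set()
        for parent in parents:
            implied |= ancestors(parent)
        reduced[job] = set(parents) - implied
    return reduced


jobs = {
    "load": set(),
    "clean": {"load"},
    "report": {"load", "clean"},  # "load" is implied by "clean"
}
print(transitive_reduction(jobs)["report"])  # {'clean'}
```

The reduced graph enforces exactly the same execution order, because every removed edge is still covered by a longer path.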
To speed up the ETL data pipeline, you should try to run jobs in parallel. Obviously, in most cases not all jobs can run at the same time, since there are dependency constraints between the jobs and limits on the server's capacity (number of processors and/or IO bandwidth).
So assuming the server allows you to run n jobs in parallel, the dependencies often leave you free to run any of a set of m different jobs with m > n.
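One common heuristic for choosing which n of the m ready jobs to start is to prefer the jobs with the longest remaining chain of dependent work (the critical path). The sketch below illustrates that heuristic only; it is an assumption for this example and not necessarily the strategy the post goes on to describe. The job names, the `duration` estimates and the `dependents` mapping are hypothetical.

```python
from functools import lru_cache


def critical_path_lengths(duration, dependents):
    """Longest total runtime from each job to the end of the pipeline.

    duration maps job -> estimated runtime; dependents maps job -> the jobs
    that can only start after it has finished.
    """
    @lru_cache(maxsize=None)
    def length(job):
        followers = dependents.get(job, ())
        return duration[job] + max((length(f) for f in followers), default=0)

    return {job: length(job) for job in duration}


def pick_jobs(ready, priorities, n):
    """Choose at most n ready jobs, longest critical path first."""
    return sorted(ready, key=lambda job: (-priorities[job], job))[:n]


duration = {"a": 1, "b": 5, "c": 2, "d": 4}
dependents = {"a": ["d"]}  # "d" has to wait for "a"
priorities = critical_path_lengths(duration, dependents)
print(pick_jobs({"a", "b", "c"}, priorities, n=2))  # ['a', 'b']
```

Job "a" is short on its own, but starting it early unblocks "d", which is why it is preferred over "c" here.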
Once you have set up a web server like Apache or nginx on the Raspberry Pi, it is time to create a website. From here there are several options: a CMS that relies on a database, some purely hand-crafted pages, or static pages generated by a script. I chose the latter for several reasons.
Static sites have a lot of advantages:
- no database to slow requests down
- greater security: they do not contain dynamic content, so they are immune to the most common attacks
- flat text files, which makes them ideal for use with version control systems such as Git
- low footprint on the server, as it only serves raw HTML files

But there are also some limitations:
I have been using Munin to monitor the health of my Raspberry Pi for a while now. As I have more devices installed in my network, I was looking for a way to monitor these devices as well. As Munin uses a client-server model, you are required to install the Munin node on the device to be monitored. Every five minutes the Munin server polls its clients for the values and creates charts using RRDTool.
After collecting some photovoltaic data using PikoPy and some readings from the residential meter, it was time to put everything together. The data is collected by a couple of scripts triggered by a cronjob every five minutes.
    $ crontab -l
    */5 * * * * python /home/solarpi/kostal_piko.py
    */5 * * * * python /home/solarpi/collect_meter.py
    */15 * * * * python /home/solarpi/collect_weather.py

The results are then written into a SQLite database.
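How a collector script might write one reading into SQLite can be sketched with Python's built-in `sqlite3` module. The table name `readings`, its columns and the `store_reading` helper are assumptions made for this example; the actual schema used by the scripts above is not shown.

```python
import sqlite3
from datetime import datetime, timezone


def store_reading(connection, source, value, timestamp=None):
    """Insert one timestamped measurement into a (hypothetical) readings table."""
    timestamp = timestamp or datetime.now(timezone.utc).isoformat()
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "CREATE TABLE IF NOT EXISTS readings ("
            "created_at TEXT NOT NULL, source TEXT NOT NULL, value REAL NOT NULL)"
        )
        connection.execute(
            "INSERT INTO readings (created_at, source, value) VALUES (?, ?, ?)",
            (timestamp, source, value),
        )


connection = sqlite3.connect(":memory:")  # a file path would be used in practice
store_reading(connection, "pv_power", 1234.5)
print(connection.execute("SELECT COUNT(*) FROM readings").fetchone()[0])  # 1
```

Since each cron run is a short-lived process, opening the database, inserting a row and closing again is usually all a collector needs to do.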
The first step of my plan, building a Raspberry Pi based photovoltaic monitoring solution, is finished. I created a Python package that works with the Kostal Piko 5.5 inverter (and theoretically should work with other Kostal inverters as well) and offers a clean interface for accessing the data:
    import pikopy

    # create a new piko instance
    p = Piko('host', 'username', 'password')

    # get current power
    print p.get_current_power()

    # get voltage from string 1
    print p.
I have been carrying my Fitbit One with me for a little over two years, and it keeps tracking my daily steps. It also estimates the distance I cover by multiplying those steps by the stride length, which you can provide either explicitly or implicitly by setting your height. In the winter of 2012 I bought my first ~Garmin Forerunner 410~ (replaced by a Garmin Forerunner 920XT) GPS watch to help me track my running (and other outdoor) activities.
A friend of mine had a photovoltaic system (consisting of 14 solar panels) installed on his rooftop last year. As I was looking for another Raspberry Pi project, I convinced him that I would set up a reliable monitoring solution that would give him access to the data in real time. The current setup comes with an inverter by the company Kostal.
The Kostal Piko 5.5 runs an internal web server showing statistics like current power, daily energy, total energy plus specific information for each string.
I have been tracking my sleep for almost two years now using my Fitbit. I started with the Fitbit Ultra and then moved on to the Fitbit One after it came out. In October 2013 I found out about the Sleep Cycle app for the iPhone. For weeks, Sleep Cycle was listed as the best-selling health app in Germany, where currently (as of January 2014) it is in second place.
After the 2013 Berlin Marathon sold out in less than four hours, the organizers decided to alter the registration process for 2014. First there was a pre-registration phase, followed by a random selection from the pool of registrants to receive a spot. Those who were selected had to register by November 11th, 2013. Any spots that were not confirmed by the 11th would be offered to pre-registered candidates according to the order in which they were randomly selected.
Two days ago the official hard-float Oracle Java 7 JDK was announced on the official Raspberry Pi blog. Prior to this there was only the OpenJDK implementation, which was lacking in performance.
Furthermore, the Raspberry Pi Foundation announced that future Raspbian images would ship with Oracle Java by default.
If you want to give it a spin you can install the JDK with:
$ sudo apt-get update && sudo apt-get install oracle-java7-jdk
If you work a lot on the command line, you are probably familiar with the top utility for seeing which process is taking the most CPU or memory. There’s a similar utility called htop, an advanced, interactive system monitor that can be used as a replacement for the default process monitoring command ‘top’ on Linux. This interactive process viewer provides a real-time, dynamic view of what’s happening on your Raspberry Pi system.
After I moved back from New Jersey in June 2008 I started to track my body weight more seriously. My routine usually consists of getting up and, after finishing my morning bathroom routine, stepping on my scale. That way I try to ensure that the conditions for each weighing are as similar as possible. I recorded my weight on paper and eventually put everything into a spreadsheet for further analysis.
One of the most important features in quantified self is the ability to export your data in an open format. Fitbit lets you download your personal data if you subscribe to a premium membership. Alternatively they provide an API at dev.fitbit.com/ that allows developers to interact with Fitbit data in their own applications, products and services.
In a blog post at quantifiedself.com, Mark Levitt shows how to export your Fitbit data into Google Spreadsheets.
If you are overclocking your Raspberry Pi or you are just curious how hot this little guy gets, there are two ways to get the internal temperature, assuming you are running Raspbian as your operating system.
Method 1:
    $ /opt/vc/bin/vcgencmd measure_temp

This gives you the temperature in degrees Celsius:

    temp=54.1'C
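If the `temp=54.1'C` output of Method 1 needs to be processed further, it can be turned into a number with a small parser. This is a minimal sketch; the helper name `parse_vcgencmd_temp` is made up for this example.

```python
import re


def parse_vcgencmd_temp(output):
    """Extract the temperature in degrees Celsius from vcgencmd output."""
    match = re.search(r"temp=([0-9]+(?:\.[0-9]+)?)'C", output)
    if match is None:
        raise ValueError("unexpected vcgencmd output: {!r}".format(output))
    return float(match.group(1))


print(parse_vcgencmd_temp("temp=54.1'C"))  # 54.1
```

On the Pi itself, the input string could come from `subprocess.check_output(["/opt/vc/bin/vcgencmd", "measure_temp"], text=True)`.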
Method 2:
If you need the temperature to be more precise (e.g. for storing it in a database or for further processing), use the following command:
If you log into your Raspberry Pi using ssh, it will prompt you for a password. Having to do this multiple times a day is very annoying. To ease the pain, and enhance security, you can use public key authentication instead. To do so, you create a pair of keys on your client and store the public key on your Raspberry Pi. Then you set up authentication by key. Afterwards the user can log in to the Raspberry Pi using their private key.
In order to visually enhance my temperature logging I added some JavaScript that computes sunrise and sunset for the 24h, 28h, weekly and monthly charts. Then I use this information to plot vertical bands on the chart indicating the effects of the sun on temperatures (and humidities):
To add the bands to your Highchart just get the sunrise and sunset value for a particular day and push it on the xAxis.
**tl;dr: Check out the charts on my Raspberry Pi**
For quite a long time I was looking for a way to monitor and record the temperature and humidity at my apartment. What was missing was a convenient, preferably wireless solution. After receiving my Raspberry Pi I started to look into that more intensively.
USB-WDE1 Receiver

The USB Weather Data Receiver USB-WDE1 wirelessly receives data from various weather sensors of ELV at 868 MHz.
Once you have set up your Raspberry Pi, chances are that you want to access it from a remote machine or host a little web site on it. The problem is that your provider usually gives you a dynamic IP, which changes every time you connect to the Internet. In Germany most (A|V)DSL providers reset your connection every 24h. The solution for this is dynamic DNS (DDNS), which automatically updates the name server in the Domain Name System (DNS).
A couple of years ago I was on a trip to Budapest with a couple of friends. While roaming the streets we were passing by a casino and my friend insisted that there was a perfect strategy that would only lead to winning at roulette tables. Curious as I was I had him explain his theory. The system basically works as follows:
First, you place a coin on red. If red wins, take your winnings and start over.
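A betting system like this is easy to try out in a simulation. The sketch below assumes that the system continues as the classic martingale, that is, the stake is doubled after every loss; the excerpt above only states the first step, so that rule is an assumption. It also assumes a European wheel with 18 red pockets out of 37. The function name and its parameters are made up for this example.

```python
import random


def play_martingale(bankroll, base_bet=1, rounds=1000, rng=None):
    """Simulate betting on red and return the final bankroll.

    Assumed rule (classic martingale): reset the stake after a win,
    double it after a loss. Stops early when the stake cannot be covered.
    """
    rng = rng or random.Random()
    bet = base_bet
    for _ in range(rounds):
        if bet > bankroll:
            break  # not enough money left for the next stake
        # European wheel: 18 red pockets out of 37 (assumption).
        if rng.randrange(37) < 18:
            bankroll += bet
            bet = base_bet
        else:
            bankroll -= bet
            bet *= 2
    return bankroll


# Repeat the experiment many times to see how often the player ends up ahead.
results = [play_martingale(100, rng=random.Random(seed)) for seed in range(1000)]
print(sum(result > 100 for result in results), "of", len(results), "runs ended with a profit")
```

The interesting part is the early exit: a long enough losing streak makes the doubled stake exceed the remaining bankroll, which is exactly the case the "perfect strategy" ignores.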
Recently I ran the St. Pat's 10 Miler in Atlantic City, NJ. It was my first official running event ever and I enjoyed it a lot.
Shortly after the race the official results were posted on the Internet. The data included not only the number and times of the participants but also their gender and age. The finisher time distribution shows that most runners finished at around 90 minutes: