CPA - Create-Python-App

create-python-app is a CLI tool for ultra-fast setup of new Python projects. It automates the creation of config files for style & lint checks, a gitignore, a basic Dockerfile, and Poetry for dependency management. An opinionated set of pre-commit hooks is included to enforce best practices and reduce dev time.

An example output is provided in ./example

Installation

MacOS, Linux

Install via the script below or get it from Releases

curl -sSL https://raw.githubusercontent.com/ysawa0/create-python-app/main/install.sh | bash
# cpa will be installed to ~/bin/cpa
# add ~/bin to your PATH
# eg: echo "export PATH=$PATH:~/bin" >> ~/.zshrc

Windows

Download the latest binary from the Releases page

Building from source

# cd to project
cargo install --path .

Usage

To create a new project:

cpa create --name myproject

Optional params:

  • --preset: Specifies a Python version for the project. Defaults to “python3.10”

Example:

cpa create --name myproject --preset python3.10
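
Once the project is created, a typical next step is to initialize the repo and activate the tooling. This is just a sketch: it assumes the generated scaffold contains a pyproject.toml and a .pre-commit-config.yaml as described above, and that pre-commit is available (via the project's dev dependencies or a global install).

cd myproject
git init                        # pre-commit hooks need a git repo
poetry install                  # install the Poetry-managed dependencies
poetry run pre-commit install   # wire the bundled hooks into .git/hooks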

Goals

  • Speed up Project Creation: Reduce the time spent on repetitive setup tasks
  • Best Practices: Encourage best practices for code quality, formatting, and style by including configs for tools like black, isort, and flake8.
  • Automation: Automate tasks such as generating .gitignore files, setting up pre-commit hooks, and configuring code linters and formatters.
  • Golang, Rust support planned

Contributions and Feedback

Users are welcome to contribute to the project by submitting pull requests or opening issues for bugs and feature requests. Feedback is also greatly appreciated to help improve the tool.

Envoylint.com

Validate your Envoy configs from your browser

envoylint.com

What is this?

This site takes an Envoy config and validates it for you. Configs are run against actual Envoy binaries, with multiple versions supported (1.16.2, 1.16.0, 1.14, 1.12).

How does it work?

It sends the config to a Lambda that runs Envoy in validate mode or against the config_load_check_tool and prints the results.
There is a 30-second timeout on the linter due to API Gateway limitations, which extremely large configs may hit.
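
If you want to reproduce the same check locally, Envoy's validate mode parses a config and exits without starting the proxy (this assumes you have an Envoy binary installed on your machine):

envoy --mode validate -c envoy.yaml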

Do you save any data?

No. All sessions run on ephemeral Lambda containers, but it’s best to never send any sensitive data.

Can I see the code?

https://github.com/ysawa0/envoylint

Sniff Kubernetes Pod requests and headers using tcpdump

When your fancy observability tools have failed you, there’s still trusty tcpdump.

This was done on Ubuntu. YMMV on other distros.

Exec into a pod

kubectl exec -it my-pod-name -- sh

Run

cat /sys/class/net/eth0/iflink
> 588 # container eth id

It should return a number: the container's eth id. That number is the interface index of the matching host-side interface, which is what we'll grep for on the node.

Now run

kubectl describe po my-pod-name | grep Node

To find out the node it’s running on

SSH into the node, then run the command below to find the eni id

ip link | grep 588
> eni89aabc12345

Now use tcpdump to sniff the requests coming in

tcpdump -A -i eni89aabc12345

To capture a specific header

tcpdump -A -i eni89aabc12345 | grep -i X-Real-IP -C 5
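
If the interface is busy, the raw capture gets noisy; standard tcpdump filters help narrow it down. For example, to capture only traffic on the port your pod serves on (8080 is just an assumed port here, substitute your own):

tcpdump -A -i eni89aabc12345 'tcp port 8080'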

AWS Double Charges Cross AZ Traffic and Two Solutions

Recently I saw an intriguing article by Corey Quinn of lastweekinaws.com. His finding was that although AWS lists cross AZ traffic as costing $0.01 / GB in their documentation, cross AZ traffic actually costs $0.02 / GB; they double charge – charging you for data going out of one AZ and again for going into the other AZ.

This pricing is the same as cross region traffic – $0.02 / GB!

When I read this, it was hard to believe! For the last couple of years, I’ve always thought that cross AZ traffic is cheaper than cross region. Many of my co-workers believed the same as well.

I just had to recreate the cross AZ tests in the post to convince myself.

The test is simple, as Corey was kind enough to detail the steps. I’ve added some additional details below to make it even easier to recreate.

I chose a region where I had absolutely nothing running in it, to rule out any external factors.

The Bill

Per the test, I sent 10GB of traffic from an EC2 in us-west-2a to another in us-west-2b. Once the bill came in, I was charged for 20GB of data. So every 1GB transferred counted as 2.

Cross AZ bills

Talking to AWS Support

After reaching out to AWS, they confirmed the results I saw. While the docs do not explicitly state that cross AZ costs are “double charged”, they do state that data “is charged at $0.01/GB each direction.”
Ingress egress price
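
Doing the math with that documented rate: the 10GB test is billed as 10GB out plus 10GB in, i.e. 20GB, and 20GB × $0.01/GB comes to $0.20 total, an effective $0.02/GB for cross AZ transfer.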

Reducing cross AZ data costs

As most organizations today deploy their services across multiple AZs for high availability, it’s difficult to reduce cross AZ data without the right architecture.

Here are two concepts being used today to combat rising cloud costs (note: implementing either is not easy!)

Service mesh (Envoy, Istio, etc.)

By adopting service mesh architecture, it’s possible to force service to service communication to be within the same AZ.

For example, if a service A container is running in us-east-1a, a service mesh sidecar container running alongside it can ensure all requests go to services also running in 1a.

Implementing this with Envoy can be done through the zone aware routing feature, or by using the Endpoint Discovery Service (EDS) to dynamically send each Envoy only the endpoints (IP addresses) of your service that are in the same AZ.
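
As a rough sketch (not a complete Envoy config), zone aware routing is enabled per upstream cluster. It assumes the endpoints are served over EDS with their locality (AZ) populated, and that Envoy knows its own zone via the node locality in its bootstrap:

clusters:
  - name: service_a
    connect_timeout: 1s
    type: EDS
    eds_cluster_config:
      eds_config:
        ads: {}
    common_lb_config:
      zone_aware_lb_config:
        # percentage of requests that should use zone aware routing
        routing_enabled:
          value: 100.0
        # zone aware routing is skipped if the upstream cluster has fewer hosts than this
        min_cluster_size: 6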

Cluster per AZ

This is a model being adopted by companies at scale like Tinder, Lyft, and Reddit, and Kubernetes is what makes it practical. The idea here is to have a production cluster per AZ, where each cluster has a complete replica of all your services – each cluster is an independent copy.

Cluster / AZ. Image taken from Reddit k8s talk

All communication within each cluster stays in the same AZ, since traffic is pinned inside the cluster.

Of course, any shared data stores must be located outside the clusters. Managed data stores such as S3, DynamoDB and ElastiCache make a good fit.

Replay log files stored locally with ShadowReader load testing

ShadowReader can parse logs stored locally and push them to S3, so that they can be replayed by the load testing Lambdas.

The only requirements are that:

  • Logs must be in a consistent format.
  • You must supply a RegEx describing the log format.
  • You must supply the time format for the timestamps in the logs.

Below is an example of how to parse logs stored in the default Nginx log format

log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';

How to

First, save the below to a logs.txt file.

10.168.166.132 - - [15/Mar/2019:04:12:24 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:12:31 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:12:39 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:12:46 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:12:54 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:13:01 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:13:09 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:13:16 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:13:24 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:13:31 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:13:39 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:13:46 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:13:54 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:14:01 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:14:09 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:14:16 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:14:24 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:14:31 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:14:39 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:14:46 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:14:54 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:15:01 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:15:09 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:15:16 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:15:24 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:15:31 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:15:39 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:15:46 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:15:54 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.168.78 - - [15/Mar/2019:04:16:01 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"
10.168.166.132 - - [15/Mar/2019:04:16:09 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"

Now run the local parser, parser.py, via the terminal. A few notes on the RegEx:

  • The capturing group for the timestamp field must be named timestamp.
  • There must be a capturing group named uri which captures the URI of the logged event.
  • The RegEx must be in Python regex format (named groups like (?P<name>...)).

parser.py accepts the following parameters:

:param file: Name of log file to parse. Accepts wildcards.
:param app: Name of the application for the logs.
:param bucket: S3 bucket to store the parsed logs to, Ex: "my-bucket123"
:param timeformat: The format of the timestamp in the logs. Ex: 'DD/MMM/YYYY:HH:mm:ss ZZ'
Accepts the following tokens: https://pendulum.eustace.io/docs/#tokens
:param regex: Regex to use to parse the logs.
Ex: '(?P<remote_addr>[\S]+) - (?P<remote_user>[\S]+) \[(?P<timestamp>.+)\] "(?P<req_method>.+) (?P<uri>.+) (?P<httpver>.+)" (?P<status>[\S]+) (?P<body_bytes_sent>[\S]+) "(?P<referer>[\S]+)" "(?P<user_agent>[\S]+)" "(?P<x_forwarded_for>[\S]+)"'
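
Before running parser.py, it can help to sanity-check the RegEx against a single log line. Below is a quick standalone Python snippet (not part of ShadowReader) that uses the Nginx example from above:

import re

# Same RegEx as in the example above, split across lines for readability
pattern = re.compile(
    r'(?P<remote_addr>[\S]+) - (?P<remote_user>[\S]+) \[(?P<timestamp>.+)\] '
    r'"(?P<req_method>.+) (?P<uri>.+) (?P<httpver>.+)" (?P<status>[\S]+) '
    r'(?P<body_bytes_sent>[\S]+) "(?P<referer>[\S]+)" "(?P<user_agent>[\S]+)" '
    r'"(?P<x_forwarded_for>[\S]+)"'
)

line = '10.168.166.132 - - [15/Mar/2019:04:12:24 +0000] "GET / HTTP/1.1" 403 23 "-" "ELB-HealthChecker/2.0" "-"'
m = pattern.match(line)

# ShadowReader needs at least the timestamp and uri groups to be populated
print(m.group("timestamp"))  # 15/Mar/2019:04:12:24 +0000
print(m.group("uri"))        # /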

Run the local parser

# inside the shadowreader directory
pip install -r requirements-local-parser.txt
python3 parser.py logs.txt --app app1 --bucket my-bucket \
--timeformat 'DD/MMM/YYYY:HH:mm:ss ZZ' \
--regex '(?P<remote_addr>[\S]+) - (?P<remote_user>[\S]+) \[(?P<timestamp>.+)\] "(?P<req_method>.+) (?P<uri>.+) (?P<httpver>.+)" (?P<status>[\S]+) (?P<body_bytes_sent>[\S]+) "(?P<referer>[\S]+)" "(?P<user_agent>[\S]+)" "(?P<x_forwarded_for>[\S]+)"'
# Wildcard example for parsing multiple files
python3 parser.py /tmp/logs-2019* --app app1 --bucket my-bucket \
--timeformat 'DD/MMM/YYYY:HH:mm:ss ZZ' \
--regex '(?P<remote_addr>[\S]+) - (?P<remote_user>[\S]+) \[(?P<timestamp>.+)\] "(?P<req_method>.+) (?P<uri>.+) (?P<httpver>.+)" (?P<status>[\S]+) (?P<body_bytes_sent>[\S]+) "(?P<referer>[\S]+)" "(?P<user_agent>[\S]+)" "(?P<x_forwarded_for>[\S]+)"'

NOTE: The S3 bucket set in --bucket must be the same as the name of the deployed parsed_data_bucket in serverless.yml

You should see output like the below.

5 minutes of traffic data was uploaded to S3.
Average requests/min: 6
Max requests/min: 8
Min requests/min: 2
Timezone found in logs: +00:00
To load test with these results, use the below parameters for the orchestrator in serverless.yml
==========================================
test_params: {
"base_url": "http://$your_base_url",
"rate": 100,
"replay_start_time": "2019-03-15T04:12",
"replay_end_time": "2019-03-15T04:16",
"identifier": "oss"
}
apps_to_test: ["app1"]
==========================================

Paste the test_params and apps_to_test into serverless.yml and follow the other guides to start the load test.