SpirentOrion/cip-edu

Cloud & IP Engineering Educational Resources


Introduction

The cip-edu repository is a curated collection of subject areas that Cloud & IP engineers and managers can use for career development purposes.

This is not an exhaustive list of what engineers need to know in order to be successful at Cloud & IP. Nor do all engineers need to know all of this material. Really, it is just a list that people can use for inspiration or discovery.

You might notice a general lack of networking-related or STC product-related subjects. Those things are important, but we tend to always focus on them. Here we're putting a strong focus on subject areas that pertain to craftsmanship and implementation.

Subject areas are loosely categorized, but that's subjective, so take those categories with a grain of salt. Where possible, specific resources (books, articles, podcasts, etc.) are linked.

How to Use

Scan this document. Cherry-pick areas and resources. Learn. Contribute back.

Please Contribute!

A list like this could never be complete. Please add to it and get your name on the contributors list! Or fill in some of the sparse spots below. You can do this by forking the repository, committing a change, and opening a pull request. Don't know what a pull request is or how they work? Here's a guide: https://www.thinkful.com/learn/github-pull-request-tutorial/.


Software Subject Areas

General Knowledge

POSIX systems programming fundamentals: Systems programming is one of those bedrock knowledge areas that every SW engineer should have. There are many good books and tutorials. Stevens' Advanced Programming in the UNIX Environment is classic. Also check out the Unix man pages themselves (sections 2 and 3).

UNIX system administration: Similar to systems programming, every SW engineer could benefit from knowing a bit about system administration. There are a lot of older (i.e. early 2000s) books on this topic which are good historical background. For more modern systems, check out the Debian Administrator's Handbook, which is well-written and maintained. This is a good guide for Debian-based systems (which include Ubuntu).

Virtualization fundamentals: The Wikipedia article on hypervisors is a great place to start and you can spider out from there. But there's no substitute for getting your hands dirty. You've probably already used a hosted hypervisor (VirtualBox, VMware Desktop, QEMU). If you've never played with a bare metal hypervisor, try KVM or grab a copy of ESXi (you can get a free copy for personal use by registering at VMware.com) and set it up on an old PC.

Containerization fundamentals: Namespaces and control groups form the basis of containerization, so it's worth knowing what those kernel subsystems can do. But what really made containerization take off in the industry were Docker's innovations around using a simple configuration file (the Dockerfile) and overlay/union filesystems to build container images that can be easily shared. That, and creative use of overlay networking to allow for service isolation. The Docker documentation is the best place to start. You can learn a lot by reading through the docs and trying some examples. It's easy to install Docker on modern Linux systems and you can also get Docker for Windows or Mac. But keep in mind that most examples and tutorials you'll find are written for Linux. Also, check out how top-tier open source projects approach containerization by looking at their Dockerfiles on hub.docker.com.

Code craft: Or, "how not to piss off your co-workers by writing shoddy code". It is a classic and a bit dated, but you can't really go wrong with Code Complete 2.

"What Every Programmer Should Know About Memory": This "paper" (it's more like a book) explains modern memory architectures. You may need to skim. It is a good companion piece for studying in conjunction with "Mechanical sympathy" or "Data-oriented design" below.

"What Every Computer Scientist Should Know About Floating-Point Arithmetic": This is a "real" CS paper, with quite a bit of math. Even if that's not your thing, you can skim this and still learn a great deal. Also, familiarize yourself with your language's support for arbitrary-precision numbers.
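
As a small illustration of why the paper matters, here is a sketch in Python (chosen only for illustration; other languages have equivalent types). It contrasts binary floating point with the standard library's exact `Decimal` and `Fraction` types. The specific values are arbitrary.

```python
import math
from decimal import Decimal
from fractions import Fraction

# 0.1 and 0.2 have no exact binary representation, so the sum drifts slightly.
float_sum = 0.1 + 0.2
float_equal = (float_sum == 0.3)

# Comparing floats with a tolerance is usually what you want instead of ==.
float_close = math.isclose(float_sum, 0.3)

# Decimal does exact base-10 arithmetic when constructed from strings.
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Fraction stores exact rationals backed by arbitrary-precision integers.
fraction_equal = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

print(float_equal, float_close, decimal_equal, fraction_equal)
```

The exact types cost more time and memory than hardware floats, so they tend to be reserved for cases such as currency or exact ratios.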

"Latency Numbers Every Programmer Should Know": This is another aged classic. Some of the absolute numbers are off now, but those were never the point. If you're interested in building performant systems, it's the ratios that matter.

Test-Driven Development: TDD arguments can turn into religious wars. Without getting into that, let's just say that if you're a SW engineer and you're not writing tests of some sort, then you're doing it wrong. The Wikipedia article on TDD will lead you to all the details, including variations like Behavior-Driven Development. You should at least know what the xUnit-style testing framework is for your programming language and how to use it.
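
For a feel of what an xUnit-style framework looks like, here is a minimal sketch using Python's built-in `unittest` module. The function under test, `parse_port`, is hypothetical and exists only for this example.

```python
import unittest


def parse_port(text):
    """Hypothetical function under test: parse a TCP/UDP port number."""
    port = int(text)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    def test_valid_port(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_out_of_range(self):
        with self.assertRaises(ValueError):
            parse_port("70000")

    def test_not_a_number(self):
        with self.assertRaises(ValueError):
            parse_port("http")
```

Saved in a file, these tests can be run with `python -m unittest`. In a TDD workflow the test class would be written first and `parse_port` filled in until the tests pass.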

Stuff You Probably Learned In School, Have Since Forgotten, But Really Matters

Data structures and complexity analysis: Every practicing SW engineer should have at least a working knowledge of the typical undergrad CS curriculum in this area. The classic reference is Introduction to Algorithms but The Algorithm Design Manual is more readable and even entertaining in spots. Bookmark the Big-O Reference. Also, if you can find a way to model your problem as a graph, then you can probably solve it using a graph algorithm.
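
To make the graph remark concrete, here is a sketch of breadth-first search, which finds a path with the fewest hops in an unweighted graph. The adjacency-list dictionary and the node names are assumptions made for the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency-list dict.

    Returns the list of nodes on a fewest-hops path, or None if unreachable.
    """
    previous = {start: None}      # also serves as the "visited" set
    pending = deque([start])
    while pending:
        node = pending.popleft()
        if node == goal:
            path = []
            while node is not None:   # walk the back-pointers to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                pending.append(neighbor)
    return None


# Hypothetical topology: which node can reach which in one hop.
topology = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
print(shortest_path(topology, "a", "e"))
```

Each node and edge is handled at most once, so the running time is O(V + E), which is the kind of result the complexity references above help you reason about.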

Operating system fundamentals: This is the more theoretical side of "POSIX systems programming". Keep your old OS course textbook around to use as a reference or borrow someone's. [Editor's note: if anyone has a favorite reference, please add via a PR].

Concurrent programming models and primitives: This relates to OS fundamentals. You've got to know the difference between a process, thread, and green thread (among other things). But you also need to know what your language runtime offers in terms of concurrency models and primitives. It is difficult to find one reference that surveys the entire range of concurrent programming models. You could start by contrasting shared memory approaches vs message-passing approaches. Shared memory will tend to lead you into explorations of threads and synchronization primitives. Message passing will lead you to channels, actors, coroutines, and processes. Related, you should also look into immutability, and whatever support your language has for immutable variables and data structures.
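
The shared-memory versus message-passing contrast can be sketched with Python threads. Both functions below compute the same sum; the function names and the worker count are arbitrary choices for this example, not a recommended design.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """Shared memory: every thread updates one variable, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for value in chunk:
            with lock:  # without the lock, += is a read-modify-write race
                total += value

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


def message_passing_sum(values, workers=4):
    """Message passing: threads share no state and report over a queue."""
    results = queue.Queue()

    def work(chunk):
        results.put(sum(chunk))  # one message per worker

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(results.get() for _ in range(workers))


print(shared_memory_sum(list(range(1000))), message_passing_sum(list(range(1000))))
```

In the second version the only synchronization point is the queue itself, which is the property that channels and actors build on.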

Lexers and parsers: Why does this matter if you're not writing a compiler? Because of that old JWZ quote and SW engineers' tendency to want to hit every nail with a regex sledgehammer. You absolutely should know regular expressions (Mastering Regular Expressions), but knowing when not to use them, and being able to use a parser instead will separate you from the masses. If you have spare time, read Steve Yegge's epic "Rich Programmer Food" for inspiration. Depending on what language you're using, you may be able to find a good lexer/parser library and those docs may get you through. If you want a theoretical text, you could check out "The Dragon Book". It's the text everyone says to read but it's over-priced and if you try to read it then you'll find out why everyone hates it. A better alternative might be the first couple chapters of Engineering a Compiler. When it comes to practical guides, The Definitive ANTLR4 Reference is good, as is the ANTLR4 parser generator itself. If you know Golang, or are willing to learn a little bit of it, then Writing an Interpreter in Go will teach you what you need to know in bite-sized pieces. And then you'll be ready to reach for ANTLR4 or some other parser generator library.
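
As a taste of where regular expressions stop and parsers begin, here is a sketch of a tiny lexer plus recursive-descent parser for arithmetic with nested parentheses. The regex handles only the tokens; the nesting is handled by recursion. The grammar and names are made up for this example, and a parser generator such as ANTLR4 would produce something more complete.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Lexer: split the input into number and operator tokens."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):  # expr := term (("+" | "-") term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):  # term := factor (("*" | "/") factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):  # factor := NUMBER | "(" expr ")"
        token = self.take()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("2 * (3 + 4) - 1"))
```

Each grammar rule maps to one method, so operator precedence and arbitrarily deep nesting fall out of the call structure rather than out of an ever-growing regex.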

Finite state machines: Most of us know from reading RFCs that state machines form the basis of many protocols. But they're useful for modeling all sorts of processes and computations -- especially those that have to interact with the outside world via I/O. We've probably all written or maintained code that's trying to manage this sort of thing ad-hoc. Without the rigor of state machines, how do you make this sort of code reliable and maintainable? Start with the Wikipedia article, and at least read through the "Classification" section. You can spider out from there. Depending on your language, you may be able to find libraries that help you implement state machines. But even if not, thinking about a problem in terms of an FSM, documenting the FSM, and coming up with any sort of code that mirrors the FSM would be a huge win compared to the ad hoc approach.
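
Here is one way "code that mirrors the FSM" might look: a transition table that doubles as documentation. The states and events below describe a made-up connection lifecycle and are assumptions for this sketch, not taken from any particular protocol.

```python
# (current state, event) -> next state. Anything not listed is invalid.
TRANSITIONS = {
    ("closed", "connect"): "connecting",
    ("connecting", "ack"): "established",
    ("connecting", "timeout"): "closed",
    ("established", "close"): "closed",
    ("established", "error"): "closed",
}


class Connection:
    """Hypothetical connection whose behavior is driven by the table above."""

    def __init__(self):
        self.state = "closed"

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            # Invalid events are rejected explicitly instead of being ignored.
            raise ValueError(f"event {event!r} not valid in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state


conn = Connection()
for event in ("connect", "ack", "close"):
    print(event, "->", conn.handle(event))
```

Because every legal transition is in one table, it is straightforward to review it against a diagram and to test that unexpected events are rejected.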

Basic statistics: Fundamentally, our products make measurements and these need to be processed, interpreted, and presented using statistics. SW engineers should at least understand measures of central tendency, statistical significance, and probability distributions. It's not necessary to know how to calculate all of these but you should know that they exist and what they mean. This is another area where it might be a good idea to keep that old college textbook around. [Editor's note: if anyone has a favorite reference, please add via a PR].
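
A quick sketch of why the choice of summary statistic matters, using Python's `statistics` module. The latency samples are invented for this example; one outlier is enough to pull the mean far away from the typical value.

```python
import statistics

# Hypothetical latency samples in milliseconds, with a single outlier.
latencies_ms = [10, 11, 9, 10, 12, 10, 11, 250]

mean = statistics.mean(latencies_ms)      # 40.375, dominated by the outlier
median = statistics.median(latencies_ms)  # 10.5, close to the typical sample
stdev = statistics.stdev(latencies_ms)    # sample standard deviation

print(f"mean={mean} median={median} stdev={stdev:.1f}")
```

Reporting only the mean here would suggest a typical latency of about 40 ms, which none of the samples is near.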

Basic queuing theory: Obviously queuing theory is fundamental to performance of network devices. The Wikipedia article is a good starting point. The classic text is Introduction to Queueing Theory, available for free online. This is "just math", so there are many more reference texts available.
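
As a worked example, the standard results for an M/M/1 queue (Poisson arrivals, exponential service times, one server) can be computed in a few lines. The function name and the sample rates are assumptions for this sketch.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state M/M/1 results; both rates are in the same unit, e.g. packets/s."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    utilization = arrival_rate / service_rate               # rho = lambda / mu
    mean_in_system = utilization / (1.0 - utilization)      # L = rho / (1 - rho)
    mean_time_in_system = 1.0 / (service_rate - arrival_rate)  # W = 1 / (mu - lambda)
    return utilization, mean_in_system, mean_time_in_system


for load in (5.0, 8.0, 9.0, 9.9):
    rho, in_system, time_in_system = mm1_metrics(load, 10.0)
    print(f"rho={rho:.2f} L={in_system:.1f} W={time_in_system:.2f}")
```

The output shows the characteristic blow-up: time in the system grows slowly at moderate utilization and very steeply as utilization approaches 1. The results also satisfy Little's law, L = lambda * W.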

Systems Thinking

"The Architecture of Open Source Applications": This is actually a series of books that explore how open source projects are structured, and why. There are large and small examples. Learning how other people approach system design is a great place to start.

"Fallacies of Distributed Systems": Most modern systems involve more than one CPU and are thus distributed systems. Assumptions that we might make on single-node systems no longer hold. Also check out the follow-up, "The Network is Reliable", which unpacks the top fallacy. These papers will give you an appreciation for all the things that can and will go wrong.

"End-to-End Arguments in System Design": This is an important principle in system design. Eliminate low-level functions or subsystems that are redundant, unless their cost is justified by performance improvements.

Mechanical sympathy / Data-oriented design: Martin Thompson popularized the use of the term "mechanical sympathy" as applied to SW design. He has an entire blog about it and he's given many conference talks on it as well. The concept of "data-oriented design" is related. The best video on this comes from a talk that Mike Acton gave at CppCon 2014. This is a long video, so if you're in a hurry you can at least skip over the intro where Mike talks about how much fun the game development industry is. He starts talking about the title topic around 10:24.

"The Tail at Scale": Large-scale systems have many components that can introduce latency. Even one such component can dramatically impact a large portion of requests.

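The arithmetic behind "The Tail at Scale" is easy to reproduce. Assuming each backend is slow independently with some small probability, this sketch computes how often a request that fans out to many backends has to wait for at least one slow response. The function name is made up for the example.

```python
def slow_request_probability(p_slow, fanout):
    """Chance that at least one of `fanout` independent calls is slow."""
    return 1.0 - (1.0 - p_slow) ** fanout


# With 1% slow responses per backend, fan-out quickly makes slowness the norm.
for fanout in (1, 10, 100):
    print(fanout, round(slow_request_probability(0.01, fanout), 3))
```

With a fan-out of 100, roughly 63% of requests are affected even though each backend is slow only 1% of the time.
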
CAP Theorem: The CAP theorem is often described as "Consistency, availability, or partition-tolerance. Pick two." That's fun to throw around and leads to debates about whether "AP" or "CP" is better for this or that. But it is also a mis-characterization because you can't not pick "P". In a real-world system, network partitions will happen (see "Fallacies..." above). But the CAP theorem is still really interesting, and it spawned a lot of thinking that has led to extensions like "PACELC". Also check out Kyle Kingsbury's "Call Me Maybe" blog series where he absolutely destroys database vendors' claims re: consistency. Kyle also has some good conference videos.

"A Collection of Postmortems": This is Dan Luu's curated collection of postmortems (write ups that document system failures). It is interesting reading re: all the ways that real systems have failed.

"How Complex Systems Fail": This short paper comes from outside our industry (it was written by an MD!) so it reads a bit differently. But it is often quoted when it comes to reliability engineering. It's interesting to think about if you're dealing with system correctness when people are involved.

"Systems Performance: Enterprise and the Cloud": This is a huge book that is packed with good techniques for understanding and improving system performance. Written by Brendan Gregg, ex-Sun engineer, now at Netflix. Creator of latency/utilization heatmaps and flamegraphs. Also a DTrace expert.

Systematic Debugging

Systematic debugging: Dan Luu believes that systematic debugging can be taught. This is a good read. This is also perhaps an area where we can learn by reading anecdotes. Similar to the postmortem collection linked above, Dan also maintains a collection of debugging stories.

"Three Questions About Each Bug You Find": Just like with snakes, where there is one bug, there are probably more. This is a good reminder and some inspiration for making a bigger system impact when you're debugging.

Code benchmarking: It's not exactly debugging, but since we often have to deal with performance problems that manifest themselves as CRs, it is good to know what's available in your language for micro-benchmarking. Check out Google's benchmark library for C++, the Golang profiler, or Python's profile packages.
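
For Python, a micro-benchmark can be sketched with the standard library's `timeit` module. The two functions being compared are arbitrary examples, and the absolute numbers will vary by machine.

```python
import timeit


def concat_loop(n=1000):
    out = ""
    for i in range(n):
        out += str(i)
    return out


def concat_join(n=1000):
    return "".join(str(i) for i in range(n))


def best_time(func, repeat=5, number=200):
    """Best per-call time in seconds; the minimum is the least noisy estimate."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


for func in (concat_loop, concat_join):
    print(f"{func.__name__}: {best_time(func) * 1e6:.1f} us per call")
```

Taking the minimum over several repeats filters out interference from other processes. For whole-program questions a profiler such as `cProfile` is usually the better tool.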

Technology-Specific

Golang: If you want to learn Go, start with the online tutorial. The standard packages are well-documented and at some point you should read "Effective Go" and "50 Shades of Go".

C (at least C99, up to C11): "K&R" covers ANSI C, which is older than C99, but it should still be on every C programmer's shelf. The Wikipedia articles for C99 and C11 are good references for what is new in each of those specs since K&R.

C++ (at least C++03, up to C++14): The language specifications themselves are brutal. Check out The Definitive C++ Book Guide and List for a pile of resources that you can use to grok C++.

Design patterns: You have probably already seen "Gang of Four", but if not, you need to be familiar with this material in order to deal with STC's C++ codebase.

Debugging and profiling tools (gdb, gprof, valgrind, perf): TODO

X86 assembly language: TODO

ARM assembly language: TODO

Python scripting: TODO

Bash scripting: Google's Shell Style Guide has really useful advice, not the least of which is "Shell should only be used for small utilities or simple wrapper scripts". If you're working on shell scripts that will be committed to any repository anywhere, then you should also get yourself a copy of shellcheck.

Pros and cons of various encodings (JSON, YAML, XML, Protobuf): The book goes well beyond encodings, but Chapter 4 of Designing Data-Intensive Applications contains a good discussion about when and why you'd want to use certain encodings. The rest of the book is good too, and the author, Martin Kleppmann, has some great conference talks.
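
One of the trade-offs can be seen with nothing but the Python standard library: a self-describing text encoding versus a compact binary encoding that depends on a shared schema. The record and its field layout are hypothetical, and `struct` stands in here for schema-based formats such as Protobuf, which work differently in detail.

```python
import json
import struct

# Hypothetical measurement record.
record = {"port": 8080, "latency_us": 1250}

# Self-describing: field names travel with every record, readable without a schema.
as_json = json.dumps(record).encode("utf-8")

# Schema-dependent: unsigned 16-bit port then unsigned 32-bit latency,
# network byte order. Both sides must agree on this layout out of band.
LAYOUT = struct.Struct("!HI")
as_binary = LAYOUT.pack(record["port"], record["latency_us"])

port, latency_us = LAYOUT.unpack(as_binary)
print(len(as_json), len(as_binary), port, latency_us)
```

The binary form is much smaller, but adding or reordering a field breaks every reader that has the old layout, which is the schema-evolution problem the chapter discusses.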

ZeroMQ: ZeroMQ is a networking library that goes well beyond basic BSD socket operations and also solves problems at higher layers of abstraction. Its guide is a "must read" if you're going to do any work with ZeroMQ.

Basic and intermediate use of Git: Git has a long tail in terms of tips and tricks, but you can come up to speed for basic usage with Pro Git, available for free online.

SCons internals: If you're extending our SCons build environment in any non-trivial way, then you need to understand builders, actions, and emitters.

Linux package management (.rpm and .deb): Linux packaging is a solved problem. Check out the RPM Packaging Guide for RHEL/CentOS systems and the Debian Packaging Tutorial for Debian/Ubuntu systems.

Machine learning and deep learning: For novices to machine learning and deep learning, Andrew Ng's Coursera course on machine learning is a very good place to start. If you want to go further into deep learning after the course, the book Deep Learning is strongly recommended. If you want to learn specific tools, the tutorial sections of scikit-learn, nltk, TensorFlow, and TFlearn provide a lot of examples.

Firmware-Specific

Basics of the major Linux subsystems: TODO

DPDK: Most of the Spirent drivers are now based on DPDK (Data Plane Development Kit), a user-space framework written in C that provides poll-mode drivers and other efficient APIs for managing data streams. Intel started the project and it originally supported only x86, but it has since been adopted by the wider community and now supports other processor architectures. The online documentation, the DPDK Programmer's Guide, is now fairly readable.

Kernel Drivers: Anyone tackling a Linux kernel driver must read Linux Device Drivers (O'Reilly; Corbet, Rubini, and Kroah-Hartman). It covers simple char drivers, debugging techniques, memory, interrupts, PCI, and netdevices, to name a few.

Web-Specific

Resource-based API design (REST APIs): Resource-based APIs are a different ball game compared to RPC-style interfaces. "REST" was discovered by Roy Fielding and is introduced in Chapter 5 of his PhD thesis. This is an academic description of an architectural style. As far as the real world goes, the devil is in the details. You can read about our opinionated choices in the Orion REST API Standards.

Distributed computing models and approaches: Designing Distributed Control Systems takes a pattern-based approach to reasoning about distributed systems. It won't keep you out of trouble but it will definitely give you a broad overview. Then you can reach for "Release It!" to help keep you out of trouble. You may have seen the "Glider Book" in the bookstore and thought that it had something to do with release engineering. It's nothing of the sort. The Second Edition was released in January 2018, de-emphasizing capacity management and updating for cloud deployments.

The log as a unifying abstraction: Think database transaction logs, not syslog-style logs, and not actually part of a database. Written by the Kafka architect, "The Log: What every software engineer should know about real-time data's unifying abstraction" is a worthwhile read for anyone designing a system that processes any kind of events.
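
The core idea can be sketched in a few lines: an append-only sequence addressed by offset, with each consumer tracking its own position. This is an in-memory toy with made-up class names, not how Kafka or any database implements its log.

```python
class Log:
    """Append-only sequence of records addressed by offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset):
        return self._records[offset:]


class Consumer:
    """Each consumer keeps its own offset, so consumers never block each other."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        records = self.log.read(self.offset)
        self.offset += len(records)
        return records


log = Log()
fast, slow = Consumer(log), Consumer(log)
log.append("link up")
print(fast.poll())
log.append("link down")
print(fast.poll(), slow.poll())
```

Because records are never modified and the order is total, a consumer that falls behind or restarts can replay from its last offset and arrive at the same state as everyone else.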

Fundamentals of SSL/TLS: TODO

Relational databases: Schaum's series are a good beginner resource with large numbers of examples and solved problems: Fundamentals of Relational Databases and Fundamentals of SQL Programming. For further, in-depth reading, check out Fundamentals of Database Systems or Database Management Systems.

Data warehousing and star schemas: These warrant a separate set of resources. Start with The Star Schema and related articles in that blog.

Time-series/metrics databases (InfluxDB, Prometheus): TODO

Cloud platform basics (AWS, Azure, GCE, OpenStack): TODO

"Twelve Factor Apps": This document makes 12 recommendations for software that make applications easier to run as a service (i.e. in a cloud environment).

"Dapper, A Large Scale Distributed Systems Tracing Infrastructure": How can you effectively trace API requests in a distributed system? This paper describes how Google does it. Many open-source approaches have been inspired by this.

Soft skills

Time/task management and efficiency: TODO


Hardware Subject Areas

High Speed Serial IO Basics

"SerialIO": Free book from Xilinx explaining the basics of high speed IO, which are very important to understanding Layer 1 in High Speed Ethernet.

Xilinx FPGA Design Methodology

"UltraFast Design Methodology": Manual from Xilinx explaining their UltraFast Design methodology.

Xilinx Training Materials

"Xilinx Training": Link to Xilinx training materials. Everything from manuals to videos and live online classrooms.

System Verilog Resources


Contributors

  • Barry Andrews
  • Cliff Cordeiro
  • Haijun Deng
  • David Joyner
  • Rahul Patel
  • Matt Philpott
  • Timmons Player
  • Vasu Sankaran
  • Brian Silverman
  • Robert Bruck
