Is Rust the next gen C/C++? let's get some readings and keep notes...

Why? Rust?

Performance🐾 and Security🐟, I want both!

Perform like C++ yet "secure" like a script lang. Talks like a cool script lang yet powerful enough to go deep down to system level.

As a more secure and yet efficient lang, Rust recently catches my eyes and some people even believe it can replace (at least somewhat) the long-live C++ for game dev. Although lang itself is not the one to blame for messy/buggy C++ game code (e.g. Cyberpunk 2077), I still wish there is a cleaner/stricter language to branch from C++ (instead of splicing tons of historical/back-compatible features) so that developer doesn't need to create their own code standard, wheels, memory model etc. Junior developers under C++ tends to depise such rules which potentially leading to hard-to-find bugs like dangling pointer, memory tampering, multithread resource sharing with unclear ownership etc.. Rust uses stricter compiling check (historically strict ownership) to make sure a more reliable run time code, which means like a good human language teacher who requires you to elaborate your every sentence (and it might slow you down at first). A lot of system level cli app I used are written by Rust. After my reading, I believe knowing rust can help foster lots of good C++ habits that C++ junior won't have when first diving to C++ project. Learning Rust could be the fastest way to know how to be a better C++ coder. Besides, like other mainstream langs, Rust has friendly pkg manager 👍(cargo and crates.io). It's definitely another level to old C/C++ pkg eco-sys. Enthusiastic people even consider using it to build safer OS! In web world, Rust is also modern and mutually first class to webassembly, if you believe wasm is the future web.

Rust Fast

In my pov, Rust does the right balance between safety and efficiency. C++ can do that but requires trained and discipline programmers. The question will be down to whether extra 2% perf gain worth the trade-off. Besides how about pkg manager? Does C++ give you that by default?

Rust Popular

Cheatsheet for C++ Programmer

I'm a C++ programmer, so this is for myself to revisit when forget:

  • & is stack ptr but with safer read/write + null guarantee (reference_wrapper) and can be unsafely rawed to C++ ptr
  • borrowing means better referencing (respect ownership). passing value == giveup ownership unless you has trait copy/movable (e.g. primitive type, &T, struct); passing ref only allows multiple immutable or 1 mutable
  • copy is never deep (== c++ default assignment) unless you explicitly do clone
  • str == C++ char[] + better boundary (than 0); String == C++ std::string; [T;usize] == C++ T[]; Vec<T> == C++ std::vector;
  • &str is the common arguments for String and str, and similar for other pairs like [] & Vec
  • const == C++ macro or const; static == C++ static
  • smart pointers is referece with ownership more like C++. see table from RIA books
  • smart pointers trait DeRef/Drop
  • homework or before you consider using it as an interview lang: implement simple link list using Rust (ref https://rust-unofficial.github.io/too-many-lists/second-option.html)

The Official Book (⭐⭐⭐⭐⭐)

Nice example based learning see here

Data Types

i8, i16, i32, i64, i128, isize
u8, u16, u32, u64, u128, usize
f32, f64
char (unicode 4 bytes) //b"..." is bytes[u8] and only ascii "..." is str type
bool
() empty tuple
struct Tensor(i32, i32) //tuple
struct Point{
x: i32, y: i32
}
enum PointType {
    OneDimension, // default 0 but can be =1
    TwoDimension(Ojb),
    ThreeDimension,
} //enum

[T; length] //array
const COMPILE_TIME_VALUE : i32 = 10;
static STATIC_VALUE : &str = 'Unsafe'; // unsafe if you mut it
let mut mutable = 1; //can change but you can let mutable = mutable; to unmut it 😄
  • Immutable variable vs const: const is compile time (c++ const) while immutable is for runtime. You can mut immutable vars.
  • You can shadow immutable vars but shadowing different type cannot happens
  • arch type like isize and usize is depending on pc arch used
  • overflow handling is suggested as wrapping* (mod) , checked (-> None if overflow) overflowing_ (->(val, boolean)) and saturating_ (clamp at min/max) ( can be add/substract etc.) See rust makes junior aware these issues instead of hiding like c/c++!!!
  • char is 4byte
  • tuple let a = (1,2.0,'a') and a.0 is 1 and destructing like let (x, y, z) = a
  • arrays is familiar c++/c flavor, compile time length on stack let a = [3; 5]. Out of index is protected (no invalid memory access) by rust in run time.

Functions

  • snake_case convention.
  • statement doesn't return: let x = let y = 5 or let x = y = 5 is invalid. See rust kill these brain teaser statement👍.

Flow

  • if condition must be bool. Because a lot of implicit bool in other lang brings a lot of monsters. as expression let x = if con {1} else {2}; is valid as long as type is the same
  • loop with break as expression let x = loop { ... break 2 };
  • for n in nums.iter() or for n in (1..4)
for x = 1..=n {
    ...
}
match number {
    1 => println!("One!"),
    2 | 3 | 5 | 7 | 11 => println!("This is a prime"),
    13..=19 => println!("A teen"),
    _ => println!("Ain't special"),
}

Ownership

Unique features to enforce memory allocation awareness and foster good habits. Rust want to be as powerful as c/c++ to take advantage of different memory models!

  • scope c/c++ similar RAII
  • str expression e.g. "hell,world" -> c style static/global; str -> c style string in stack; String -> c++ std::string in heap
  • all primitives on stack follow the similar pattern because they have Copy trait. When you do assignment Rust will never automatically create “deep” copies. Rust choose to transfer ownership and invalidate the previous owner unless you impl Copy.
  • let s = "hello" and s is &str with Copy trait so it can copy/assignment but it points to the same stack location. However, fixed len array is not reusing memory, instead it just Copy around like struct.
  • Tuple has copy if all its field has
  • mutable references have one big restriction: you can have only one mutable reference to a particular piece of data in a particular scope and we cannot borrow mutable ref if an immutable ref was created and used later; to sum up, as long as there's mutable ref, other immutable ref shouldn't exist!!!
  • let slice = &s[0..5] , &s[..5] or &[2..] to refer to chunk of bytes

Struct & Enum

  • let user2 = User { email, ..user };
  • reference field will be checked to force you specify lifetime
  • [#derive(Debug)] macros annotates print gen for {:?}
  • [#derive(PartialEq)]: == (all field eq) and != gen; Eq is PartialEq with one more check (a == a) so usually is marker behind like [#derive(PartialEq, Eq)]; HashMap<K, V> requires Eq
  • [#derive(PartialOrd)]: require PartialEq implemented, partial_cmp -> Option<Ordering> (None, Less, Equal, Greater)
  • [#derive(Clone, Copy)]: clone requires all implement clone, copy is fastly shallow copy on stack with (copy requires clone)
  • [#derive(Hash)]: hash get hash value
  • [#derive(Default)]: default make default values required by ..._or_default functions
  • enum Option<T> { Some(T), None } and its functions: is_none, as_ref, as_mut, unwrap(_or), map(_or), ok(_or)->Result<T, E> ...

Mod

  • mod == namespace
  • pub == public or export
  • wildcard '*' and 'as'

Collections

Vec

  • std::vec::Vec
  • let v:Vec<i32> = Vec::new(); or vec![1, 2, 3] vs arrays let a:[u8;5] = [1;5];
  • immutable borrow will fail the mutable borrow
  • fixed len: [i32;3] vs slices: &[usize] vs vec: Vec<i32>, use slice as function parameters is recommended
  • for x in v.iter() or &v
  • common methods: len, isempty, push/pop, append/split_off, dedup, clear, reserve/with_capacity, insert/remove, resize, splice, as((mut_)slice/ptr), iter/&, get/[]
  • Deref methods for binary search etc. like c++ container common

String

  • std::string::String
  • String vs str and Rust also has the real things: CString vs CStr
  • String::from("...") vs "...".to_string
  • Since the interexchange of 2 types usually happens, it's recommended to make function arguments as slices: &str
  • common methods for both String/str: len, push(str)/pop, from(_utf8_lossy), insert(_str)/remove, clear, is_empty, split(_off), as(into)(bytes/str), replace_range, find

HashMap

  • std::collections::HashMap
  • Key Eq trait is requied
  • commond methods: insert/remove, get, entry(.or_insert), contains_key
  • default rust hash is encryptically secure for some attack but slow, to perform better bring your own hasher

BTreeMap

Error Handling

enum Result<T, E> {
    Ok(T),
    Err(E),
}
  • panic!, Result (unwrap/expect to panic) or ? (return/propagate error Result)

Generic & Lifetime

pub trait Summary {
  pub fn summarize(&self) -> String {
    ...
  }
}
impl Summary for Book {
  ...
}
impl Summary for Article {
  ...
}
  • equivalent to C++ template
  • trait == interface
  • fn notify(& impl Summary) == fn notify<T:Summary>(&T)
  • use + for multiple traits: fn notify(& impl Summary + Display)
  • use where to clearer signature
  • impl<T:SomeTrait> Summary<T> for T to define generic impl (blanked impl)
  • lifetime will be smaller of your variables associated e.g. 'a
  • elision rule: 1. each param has lifetime var 2. if one param return is the same 3. else if one param is &self/&mut self, &self lifetime is assigned else compiler fails to figure out and error!
  • 'static life time is global/static

Hands On!

Rust In Action (MEAP) (⭐⭐⭐⭐) Rust Cook Book Rust Dark Art

Expected to find some lang system level explanation (e.g. virtual function table, garbage collector mechanism) instead of common lang features. Also multi-threading is great topic if it can cover.

Intros

Number Type

  • Conversion is explicit through as
  • Each type has its method like 24.5_f32.round() is method to round 24.5_f32 to 25.
  • Format type "{:b}{:o}{:x}".
  • Float comparison is tolerate with e.g. f32::EPSILON.
  • num crate extend primitive types to support e.g. complex, big integer/infinite float, fixed point currency decimal

Iteration

  • for x in 0..10
  • for x in collection.iter() or &collection
  • iter_mut is mutable while into_iter is value typed

Flow

  • loop { if ... break }
  • match 0..=40 is range of [0, 40], | is or and _ is otherwise

Function Signature

  • fn lifetime<'a, 'b>(x: &'a i32, y: &' i32) -> i32
  • fn add<T: std::ops::Add<Output=T> + std::ops::Multiply >
  • str -> c style stack/static ; String -> std::string heap and I guess primitive follow the same pattern vs Box.
  • Arrays -> c style Slice -> new/malloc Vector -> c++ vec
  • #![allow(unused_variables)] #![allow(dead_code)] and unimplemnted! macro

Struct

  • tuple Example(i32,...) is anonymous struct and reference element by .0 .1 .2 ...
  • name: String == [u8; name.size ] and data: Vec<u8> == [u8; data.size] and they are most possibly on heap
  • impl, self and return (default last one, note ;)

Implementation

  • unsafe block is used when check global in system level programming, and static globals are all capital
  • panic! to throw exception
  • const vs let: const is like c/c++ const during compile time and define fixed array while let is runtime. let is more of shared reference and it can be shadowed and convert to mut.
  • Result<T, String> -> Ok(T) or Err(String) and unwrap()
  • trait ... fn(self &Self ...) and impl SomeTrait for YourClass. It's like interface and implements/extend
  • #[derive(Debug)] is for auto trait generation and format as {:?}
  • pub as export, pub module can be pub(crate) in crate, pub(super) to parent pub(in path) to modules in path
  • //! for modules/files /// for next code and rustdoc to render the documents

Lifetime, Ownership & Borrowing

  • Copy trait is not implement unless primitive type. Assigning value leads ownership transfer. Not convenient but this avoid c/c++ copy/move constructor assumption issue (make things clear).
  • Drop is extra customized destructor traits that run with default one in case you want to do some extras
  • Ownership moves through: assignment including function arguments/returns. Tracing these ownership relief garbage collecting burden.
  • Copy vs Clone: Copy happens at assignment or ownership transfer, all primitive type implements this; Clone is clone trait. You can use compiler to generate bit by bit (shallow) copy/clone #[derive(Debug,Clone,Copy)].
  • use std::rc::Rc is std::shared_ptr, Rc is immutable, however RefCell<T> can wrap it to make struct internal mutable via .borrow_mut(). Note these 2 are not threadsafe, while Arc<Mutex<T>> is.
  • raw memory interprets: unsafe { std::mem::transmute(a) };

Integer

  • Integer overflow, big/little endian, f32 ieee format and Q7 format (new to me). In these case, try to use function like checked*(add) or overflow*
  • bit op: 1u32 << 1
  • mod similar to namespace, mod.rs is for folder.
  • #[cfg(test)] and cargo test

Pointer vs Reference

  • Reference is safe; pointer is unsafe: raw pointer it can be null and no check; smart pointer
  • Box like java is used to box primitive types. Make them of struct flavor
  • Raw pointer: let ptr = &obj as const u8 as mut u8 // &reference cannot be cast directly to mut 🙂
  • Cow copy on write smart pointer, if read, no copy!
  • all pointers listed (note not applicable means not directly used) as below:
NameUsageProsCons
Raw Pointer&obj as const u8 as mut u8speed/acess other systemunsafe and not smart🙂
Box<T>Box<T>::new(..)== c++ unique_ptrsize overhead
Rc<T>Rc<T>::new(..)== c++ shared_ptrsize/runtime cost/thread unsafe
Arc<T>ARc<T>::new(..)thread safe Rc == c++ atomic_*same as Rc
Weak<T>rc::Weak::new(..) arc::Weak::new(..)==c++ weak_ptr/thread safesize/cost
Cell<T>Cell<T>::new(..)unmutify objsize/cost
Ref(Mut)<T>Ref(Mut)<T>::new(..)&(mut)'s runtime versionsize/cost
RefCell<T>RefCell<T>::new(..)unmut & combine with Rc/Arcsize/cost
Cow<T>Cow:from(..)avoid clone if readsize
StringString::from(str)resizable/encoding checksize/cost
Vec<T>Vec<T>::new([])resizeablesize/cost
RawVec<T>RawVec<T, A: Alloc>::new([])resizeable/allocatorbehind Vec & VecDeq, not applicable in code
Unique<T>Unique<T>::new(..)used by String, Vec etc.not applicable in code
Shared<T>Shared<T>::new(..)used by Rc, Arc etc.not applicable in code
UnsafeCell<T>?unsafenot applicable in code
  • AsRef<T> and AsMut<T> to support low cost similar to Borrow, need to pay attention to Borrow Cow ToOwned and AsRef!
  • Box:new, * unary and drop -> C++ new/delete
  • Trace heap: timeout 20 ltrace -T -o trace.txt -e -*+malloc+free+realloc ./target/debug/particles
  • Rust can access low level system api like c/c++, e.g. winapi, kernel32

File & Storage

  • serde -> serialize/deserialize -> support json, cbor, bincode, with annotation e.g. #[derive(Serialize, Deserialize)]
  • let a = br#"...."#; //multiline string format as bytes
  • unwrap->expect("error message")
  • OpenOptions::new().read(true).write(true).create(true)... Java flavor builder for options
  • Path and PathBuf
  • #[cfg(target_os="linux")] -> ifdef LINUX
  • ? for std Result handling, Ok-> unwrap Err-> return
  • Riak and Bitcask db (guarantee never lost data)
Fixed-width header      Variable-length body
-----------------    --------------------------
/                \ /                            \
+=====+=====+=====+====== - - +============= - - +
| u32 | u32 | u32 | [u8]      | [u8]             |
+=====+=====+=====+====== - - +============= - - +
chksum klen  vlen   key         value
  • HashMap .insert []==.index() .get()
  • json!("{..json..format}") from serde_json

Networking

  • reqwest::get(url)->Result<(), Box<dyn std::error::Error>>
  • traits/abstract

impl to attach fn

impl Abstract {
    fn ...
}

type conversion

Require impl the traits if you convert e.g. &str -> String

impl From<i32> for Number { //TryFrom is the fallible equivalent
    fn from(s: i32) -> Self {
        Number { value: s }
    }
}
...