Lecture 03 2022
CS110L
Jan 10, 2022
Logistics
● Please make sure you’re on Slack & that you’ve filled out the intro form
(linked in last week’s slides)
● Week 1 exercises due tonight
● Slides posted on website before class
● Undergrad class: remote through next week
● Today: What is Rust’s “ownership model,” and how does it prevent common
memory errors?
○ Specifically focusing on memory leaks, double frees, and use-after-frees
○ Thursday will show how Rust prevents other sorts of memory errors
Identifying Memory Errors
A Memory Exercise
● Thanks to Will Crichton for this exercise and for giving permission to use it in
this class!
● Discuss your answers to the exercise in groups (we'll assign you to different
breakout rooms in Zoom)
Aside: the language and the compiler
    vec->data = new_data; // OOP: we forget to free the old data
    vec->capacity = new_capacity;
}
    int* n = &vec->data[0];
    vec_push(vec, 110);
    printf("%d\n", *n);
    free(vec->data);
    vec_free(vec); // YIKES
}
Wouldn’t it be nice if the compiler enforced that once free is called on a variable, that variable
can no longer be used?
Double free: a buffer is freed twice. (Sounds innocuous, but can actually lead to
Remote Code Execution: take CS 155)
(Here, we free(vec->data), and then call vec_free, which does the same thing)
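As a preview of where the lecture is heading, here is a minimal Rust sketch of that wish. The function `sum_then_free` is a hypothetical name chosen for illustration; `drop` is the standard library function that takes ownership of a value and cleans it up, playing roughly the role of `free()` here.

```rust
/// Hypothetical helper: sums the elements, then releases the buffer explicitly.
fn sum_then_free(data: Vec<i32>) -> i32 {
    let total: i32 = data.iter().sum();

    drop(data); // explicit cleanup, playing the role of free()

    // drop(data);                 // rejected by the compiler: `data` was already moved
    // println!("{}", data.len()); // also rejected: this would be a use after "free"

    total
}

fn main() {
    println!("{}", sum_then_free(vec![107, 110]));
}
```

Uncommenting either of the two commented lines turns the double free or use-after-free into a compile error instead of a runtime bug.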
Dangling Pointers
Vec* vec_new() {
    Vec vec;
    vec.data = NULL;
    vec.length = 0;
    vec.capacity = 0;
    return &vec; // OOF
}
Stack (diagram): main() [Vec vec] → vec_new() [Vec vec <vectory stuff>]
Dangling pointer: A pointer that is referencing memory that isn’t there anymore
(Here, vec points into the stack frame of vec_new, but as soon as vec_new returns, that
memory is gone)
Stack (diagram, after vec_new returns): main() [Vec vec] → vec_push() [int new_capacity, int new_data, …] 💣
(The next function call reuses the stack memory that vec still points into)
Wouldn’t it be nice if the compiler realized that vec “lives”
within those two curly braces and therefore its address
shouldn’t be returned from the function?
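A rough Rust sketch of the same constructor shows what that could look like. `IntVec` is a hypothetical stand-in for the C `Vec` struct (it wraps Rust's built-in `Vec<i32>` rather than managing a raw buffer), so treat this as an illustration rather than a translation of the exercise.

```rust
/// Hypothetical Rust counterpart of the lecture's C `Vec` struct.
#[derive(Debug, PartialEq)]
struct IntVec {
    data: Vec<i32>,
}

/// Returns the struct by value: ownership moves to the caller, so nothing
/// points into this function's stack frame after it returns.
fn vec_new() -> IntVec {
    IntVec { data: Vec::new() }
}

// The direct translation of the C bug is rejected by the compiler,
// because the returned reference would outlive the local `vec`:
//
// fn vec_new_dangling() -> &IntVec {
//     let vec = IntVec { data: Vec::new() };
//     &vec
// }

fn main() {
    let mut vec = vec_new();
    vec.data.push(107);
    println!("{:?}", vec);
}
```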
Iterator Invalidation
void main() {
    Vec* vec = vec_new();
    vec_push(vec, 107);
    int* n = &vec->data[0];
    vec_push(vec, 110);
    printf("%d\n", *n); // :(
    free(vec->data);
    vec_free(vec);
}
Diagram: vec->data, with an old buffer and a new buffer (Element 1 through Element 4 in each); int *n still points into the old buffer 💣
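For comparison, a small Rust sketch of the same sequence of operations, using the standard `Vec<i32>`. The function name `first_after_push` is made up for this example. The commented-out lines are the direct translation of the C code and do not compile; the live code shows one possible fix.

```rust
/// Hypothetical helper: reads the first element, pushes, and returns what was read.
fn first_after_push() -> i32 {
    let mut vec = vec![107];

    // Direct translation of the C code; the compiler rejects it because
    // `vec` cannot be modified while the reference `n` is still in use:
    //
    // let n = &vec[0];
    // vec.push(110);
    // println!("{}", n);

    // One fix: copy the element out, so nothing points into the buffer
    // while a push may reallocate it.
    let n = vec[0];
    vec.push(110);
    n
}

fn main() {
    println!("{}", first_after_push());
}
```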
while (frontIsClear()) {
repairColumn();
moveToNextColumn();
}
repairColumn();
● Many 106A students write repairColumn functions
that sometimes end with Karel facing south, and
other times end with Karel facing east; sometimes
with Karel at the top of the column and sometimes
with Karel at the bottom of the column; etc.
● Why is this bad?
What makes good code?
● Pre/postconditions are essential to breaking code into small pieces with well-
defined interfaces in between
○ We want to be able to reason about each small piece in isolation
○ Then, if we can verify that preconditions/postconditions are upheld in
isolation, we can string together a bunch of components and simply
verify that the preconditions/postconditions all fit together without
needing to keep the entire program in our heads
● It’s the programmer’s responsibility to make sure the pre/postconditions are
upheld
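To make the idea concrete, here is a minimal sketch of a `repair_column` function with a stated precondition and postcondition. The `Karel` struct and `Facing` enum are a hypothetical, heavily simplified model invented for this example (they are not the CS 106A API), and the assertions only check the contract at runtime; writing code that upholds it is still up to the programmer.

```rust
/// Hypothetical, simplified model of Karel: only the row within the
/// current column and the direction Karel is facing.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Facing {
    East,
    North,
    South,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Karel {
    row: u32,
    facing: Facing,
}

/// Precondition:  Karel is at the bottom of a column (row 0), facing east.
/// Postcondition: Karel is back at the bottom of the same column, facing east.
fn repair_column(karel: &mut Karel, height: u32) {
    assert!(
        karel.row == 0 && karel.facing == Facing::East,
        "precondition violated"
    );

    karel.facing = Facing::North; // turn left
    karel.row = height.saturating_sub(1); // climb to the top, repairing as we go
    karel.facing = Facing::South; // turn around
    karel.row = 0; // walk back down
    karel.facing = Facing::East; // turn left

    assert!(
        karel.row == 0 && karel.facing == Facing::East,
        "postcondition violated"
    );
}

fn main() {
    let mut karel = Karel { row: 0, facing: Facing::East };
    // The postcondition matches the precondition, so calls can be chained
    // without thinking about what happened inside the previous call.
    for _column in 0..3 {
        repair_column(&mut karel, 5);
    }
    println!("{:?}", karel);
}
```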
Good memory management
● In any complex program, you’ll allocate memory and pass it around the
codebase. Where should that memory be freed?
● If you free too early, other parts of your code might still be using
pointers to that memory
● If you don’t free anywhere (or you free in a function that only gets called
sometimes), you’ll have a memory leak
● Good C/C++ code will clearly define how memory is passed around and
“who” is responsible for cleaning it up
● If you read C/C++ code, you’ll see notions of “ownership” in the comments,
where the “owner” is responsible for the memory
/* Get status of the virtual port (ex. tunnel, patch).
*
* Returns '0' if 'port' is not a virtual port or has no errors.
* Otherwise, stores the error string in '*errp' and returns positive errno
* value. The caller is responsible for freeing '*errp' (with free()).
*
* This function may be a null pointer if the ofproto implementation does
* not support any virtual ports or their states.
*/
int (*vport_get_status)(const struct ofport *port, char **errp);
Open vSwitch
/**
* @note Any old dictionary present is discarded and replaced with a copy of the new one. The
* caller still owns val is and responsible for freeing it.
*/
int av_opt_set_dict_val(void *obj, const char *name, const AVDictionary *val, int search_flags);
ffmpeg
/**
* iscsi_boot_create_target() - create boot target sysfs dir
* @boot_kset: boot kset
* @index: the target id
* @data: driver specific data for target
* @show: attr show function
* @is_visible: attr visibility function
* @release: release function
*
* Note: The boot sysfs lib will free the data passed in for the caller
* when all refs to the target kobject have been released.
*/
struct iscsi_boot_kobj *
iscsi_boot_create_target(struct iscsi_boot_kset *boot_kset, int index,
void *data,
ssize_t (*show) (void *data, int type, char *buf),
umode_t (*is_visible) (void *data, int type),
void (*release) (void *data))
{
return iscsi_boot_create_kobj(boot_kset, &iscsi_boot_target_attr_group,
"target%d", index, data, show, is_visible,
release);
}
EXPORT_SYMBOL_GPL(iscsi_boot_create_target);
Linux kernel
Sometimes, custom cleanup functions must be used to free memory. Calling free() on this
memory would be a bug!
Open vSwitch
/**
* dvb_unregister_frontend() - Unregisters a DVB frontend
*
* @fe: pointer to &struct dvb_frontend
*
* Stops the frontend kthread, calls dvb_unregister_device() and frees the
* private frontend data allocated by dvb_register_frontend().
*
* NOTE: This function doesn't frees the memory allocated by the demod,
* by the SEC driver and by the tuner. In order to free it, an explicit call to
* dvb_frontend_detach() is needed, after calling this function.
*/
int dvb_unregister_frontend(struct dvb_frontend *fe);
Linux kernel
Ownership can sometimes get extremely complicated, where one part of the codebase is
responsible for freeing part of a data structure and a different part of the codebase is
responsible for freeing a different part
// lhmslv_free will free the keys: we only need to free the void-star values.
for (lhmslve_t* pa = pstate->pcounts_by_group->phead; pa != NULL; pa = pa->pnext) {
unsigned long long* pcount = pa->pvvalue;
free(pcount);
}
lhmslv_free(pstate->pcounts_by_group);
...
}
Miller
Pre/postconditions must be consistently upheld
● It’s up to the programmer to make sure to get this right. If you don’t uphold
the interface, your program is broken
○ Consequences: anything from denial of service (e.g. memory leak) to
remote code execution (e.g. double free, use-after-free, buffer overflow)
● The compiler cannot help you out
○ Static analyzers can help sometimes, but not always (see week 1
exercises)
● Key point: the compiler does not know what your pre/postconditions are, because
the C language has no way to express them
Type systems
● The types of a programming language are like the nouns of a spoken language
○ When you talk, what do you talk about?
● C type system: numbers, pointers, structs… not much else
○ Extremely simple: can learn most of the C language in half a quarter of CS 107
○ Simple != easy
● The pre/postconditions may be written in comments, but they are not present in the actual code,
because the C language does not have a way for them to be expressed
● Consequently: the compiler is unaware of what you’re trying to do
Are there better type systems that we can use to specify
our preconditions/postconditions in the code?
(implication: if the compiler can understand your pre/postconditions, it can verify that they are met)
(Meet Rust 🦀 )
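As a sketch of what that can look like, the three Rust signatures below encode the kind of ownership notes that the C examples above put in comments. The function names are hypothetical (loosely inspired by the Open vSwitch comment) and are not taken from any real library.

```rust
/// Returns an owned `String`: the caller becomes the owner, and the memory
/// is released when the caller's variable goes out of scope.
fn vport_status_message(code: i32) -> String {
    format!("virtual port error {}", code)
}

/// Borrows the message: the caller still owns it afterwards.
fn message_len(msg: &str) -> usize {
    msg.len()
}

/// Takes ownership: the message is released when this function returns,
/// and the caller can no longer use it.
fn consume_message(msg: String) -> usize {
    msg.len()
}

fn main() {
    let msg = vport_status_message(5);
    println!("{} bytes", message_len(&msg)); // borrow: `msg` is still usable
    println!("{} bytes", consume_message(msg)); // move: `msg` is gone after this
    // println!("{}", msg); // rejected by the compiler: `msg` was moved
}
```

The "who frees this?" question is answered by the parameter and return types, so the compiler can check it at every call site.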
What if Ownership lived in the
programming language?
Ownership Visualized
fn main() {
let julio = Bear::get();
// play with bear
}
Ownership Visualized
my_cool_bear_function(/* parameter */ ) {
// Do stuff
}
● Can think of `parameter` as
○ a new local variable in this function,
○ …which now owns the data
○ …and is responsible for cleaning it up when it
goes out of scope (when the function returns)
● Once data is cleaned up, it can no longer be
used (e.g., by variable `julio` in the original function).
● In teddy bear terms: `julio` gives bear to someone else, that
person “goes home” at the end of the function, and that
person is responsible for putting the bear away. After bear is
put away, `julio` can no longer play with the bear.
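The teddy bear story can be sketched in Rust as follows. This `Bear` type is hypothetical: it carries a shared event log (an assumption added purely so the order of events can be observed), and its `Drop` implementation stands in for "putting the bear away".

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log, used only to observe when cleanup happens.
type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical bear that records when it is put away.
struct Bear {
    log: Log,
}

impl Drop for Bear {
    fn drop(&mut self) {
        self.log.borrow_mut().push("bear put away".to_string());
    }
}

/// Takes ownership of the bear: `bear` is a new local variable here,
/// and it is cleaned up when this function returns.
fn my_cool_bear_function(bear: Bear) {
    bear.log.borrow_mut().push("playing with bear".to_string());
    // `bear` goes out of scope here, so `drop` runs now
}

fn run() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let julio = Bear { log: Rc::clone(&log) };

    my_cool_bear_function(julio); // ownership moves into the function
    // let _ = &julio.log; // rejected by the compiler: `julio` was moved

    log.borrow_mut().push("back in main".to_string());
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```

The log shows the bear being put away before control returns to the caller, which is why `julio` cannot be allowed to touch it afterwards.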
How will I ever decompose code????
Borrowing
Speech bubble (to my_cool_bear_function): "Hey, my_cool_bear_function, you could BORROW this toy. Just give it back when you're done!"
let julio = Bear::get();
my_cool_bear_function(&julio)
/* The julio variable can still be used here to access the teddy bear! */
Speech bubble (reply): "Thank you, this means you'll have to put the toy back when you're done though!"
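A runnable version of the borrowing slide might look like the sketch below. The slide only shows `Bear::get()` and `my_cool_bear_function(&julio)`; the `name` field and the function bodies are assumptions filled in for illustration.

```rust
/// Hypothetical bear with a name, so there is something to read.
struct Bear {
    name: String,
}

impl Bear {
    /// Assumed constructor matching the `Bear::get()` call on the slide.
    fn get() -> Bear {
        Bear { name: String::from("teddy") }
    }
}

/// Borrows the bear: the function can look at it, but the caller keeps
/// ownership and remains responsible for cleaning it up.
fn my_cool_bear_function(bear: &Bear) -> String {
    format!("playing with {}", bear.name)
}

fn main() {
    let julio = Bear::get();
    let message = my_cool_bear_function(&julio);
    println!("{}", message);

    // The julio variable can still be used here to access the teddy bear!
    println!("julio still has {}", julio.name);
} // `julio` goes out of scope here, and only now is the bear cleaned up
```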