Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 80 additions & 14 deletions compiler/rustc_expand/src/expand.rs
Original file line number Diff line number Diff line change
Expand Up @@ -483,6 +483,7 @@ fn get_param_ident(pat: &Box<Pat>) -> String {
}
}

/// MDE: Does this take an arbitrary Rust type, or only primitives and strings?
// Given a Rust type, return its Daikon rep-type.
// E.g.,
// i8 -> int
Expand All @@ -505,6 +506,9 @@ fn get_prim_rep_type(ty_str: &str) -> String {
|| ty_str == U128
|| ty_str == USIZE
{
/// MDE: Minor: It seems wasteful to create new strings repeatedly
/// rather than (say) returning a &str view into a string that is only
/// ever created once.
return String::from("int");
} else if ty_str == F32 || ty_str == F64 {
return String::from("");
Expand All @@ -520,8 +524,12 @@ fn get_prim_rep_type(ty_str: &str) -> String {
String::from("")
}

/// MDE: Does arguments mean the values that are in it? Or the type parameter? Or something else?
// Given the arguments to a Vec or array, return a RepType
// enum representing the Vec/array.
/// MDE: `grok` is an uninformative name. I think that all the `grok_*`
/// functions do instrumentation, and if that is the case, then `instrument_`
/// would convey more to the reader.
fn grok_vec_args(path: &Path) -> RepType {
let mut is_ref = false;
match &path.segments[path.segments.len() - 1].args {
Expand All @@ -547,7 +555,9 @@ fn grok_vec_args(path: &Path) -> RepType {
}
}

// Capable of representing the rep-type of a Rust type.
/// MDE: I'm curious why the functions above return a string rather than a RepType.
/// As a general rule, passing around strings that will later be parsed is a code smell.
// Represents the Daikon rep-type of a Rust type.
// String payload represents the corresponding "Java" type
// i32 -> Prim("int")
// &[i32] -> PrimArray("int")
Expand All @@ -561,6 +571,9 @@ enum RepType {
HashCodeStruct(String),
}

/// MDE: "with is_ref" is confusing. Say "by side-effecting" or "by setting".
/// MDE: This returns a rep type and sets variable is_ref. Why isn't is_ref a
/// field in the RepType struct?
// Given a Rust type kind, return its RepType. Also note whether the type
// is a reference with is_ref.
fn get_rep_type(kind: &TyKind, is_ref: &mut bool) -> RepType {
Expand All @@ -581,6 +594,7 @@ fn get_rep_type(kind: &TyKind, is_ref: &mut bool) -> RepType {
return get_rep_type(&mut_ty.ty.kind, is_ref);
}
TyKind::Path(_, path) => {
/// MDE: Minor: ".is_empty()" is more idiomatic than ".len() == 0".
if path.segments.len() == 0 {
panic!("Path has no type");
}
Expand All @@ -590,6 +604,7 @@ fn get_rep_type(kind: &TyKind, is_ref: &mut bool) -> RepType {
return RepType::Prim(maybe_prim_rep);
}
if ty_string == VEC {
/// MDE: What remains to be done here?
// FIXME
return grok_vec_args(&path);
}
Expand All @@ -609,6 +624,9 @@ fn map_params(decl: &Box<FnDecl>) -> HashMap<String, i32> {
res
}

/// MDE: make the naming more descriptive. Rather than saying "a map data
/// structure", say what it maps to and from. Rather than "DeclsHashMap",
/// indicate its type or use. Also make the field name "map" more informative.
// Immutable visitor to visit all structs and build a map data structure.
// FIXME: remove, we will use a /tmp file instead.
#[allow(rustc::default_hash_types)]
Expand All @@ -634,23 +652,37 @@ impl<'a> Visitor<'a> for DeclsHashMapBuilder<'a> {
}
}

// Main struct for walking functions to write the decls file.
// map allows for quick retrieval of struct fields when a struct
// parameter is encountered.
// depth_limit tells us when to stop writing decls for recursive structs.
// Struct whose methods walk C functions to write the decls file.
/// MDE: The (informal) Rust standard is to document formal parameters using the
/// following syntax. Note leading blank line, then asterisk, backticks, and
/// hyphen per line. Please rewrite all the comments to conform to this syntax.
//
// * `map` - Allows for quick retrieval of struct fields when a struct
// parameter is encountered.
// * `depth_limit` - Tells us when to stop writing decls for recursive structs.
#[allow(rustc::default_hash_types)]
struct DaikonDeclsVisitor<'a> {
pub map: &'a HashMap<String, Box<Item>>,
pub depth_limit: u32,
}

// Represents a parameter or return value which must be written to decls.
/// MDE: what is the purpose of this map? Also, throughout, use more
/// descriptive variable names than "map".
// map: map from String to struct definition with field declarations.
/// MDE: Here and elsewhere, you need to document the fact that this value
/// can also be "false", and under what circumstances. I suspect it would be
/// more idomatic to make var_name an Option type rather than having a special
/// string value.
// var_name: parameter name, or "return" for return values.
// dec_type: Declared type of the value (dec-type for Daikon).
// rep_type: Rep type of the value (rep-type for Daikon).
/// MDE: "key" is not very descriptive. Could you use "struct_name" instead?
// key: If the value is a struct, contains the struct type name for lookup,
// otherwise None.
/// MDE: Minor: It might simplify the code to make field_decls an empty array if
/// the value is not a struct. Then there is no need to unwrap the Option
/// wrapper.
// field_decls: If the value is a struct, represents decl records for the
// fields of this struct.
// contents: If the value is Vec or array, a decls record for the contents
Expand All @@ -667,9 +699,11 @@ struct TopLevlDecl<'a> {
pub contents: Option<ArrayContents<'a>>,
}

// Represents a field decl of a struct at some arb. depth.
// Represents a field decl of a struct at some arbitrary depth.
/// MDE: Is "identifier" a type name, a variable name, or something else?
// enclosing_var: the identifier of the struct which contains this field.
// field_name: name of this field.
/// MDE: What other fields does the below comment refer to?
// See TopLevlDecl for other fields.
#[allow(rustc::default_hash_types)]
struct FieldDecl<'a> {
Expand Down Expand Up @@ -723,6 +757,7 @@ impl<'a> ArrayContents<'a> {
}
}

/// MDE: Throughout, use a more descriptive name than "key".
// If the top-level variable is an array of structs, use our key to fetch field definitions
// of our struct type.
fn get_fields(&self, do_write: &mut bool) -> ThinVec<FieldDef> {
Expand All @@ -749,8 +784,14 @@ impl<'a> ArrayContents<'a> {
}
}

/// MDE: "If var is an array"?
// If var array of structs, recursively populate sub_contents by creating
// a new ArrayContents for each field.
/// MDE: I am confused about `do_write`. I think it means "should write",
/// and should_write could be a better name. Ah, "do_write" is probably by
/// contrast with "dont_write" (which is clear enough), but do_write is
/// still unnecessarily confusing.
/// MDE: Use a more descriptive name than "the tmp file".
// do_write: I think this was a hack for avoiding structs/enums/unions which did
// not belong to the crate. That is again an ongoing issue with the /tmp
// file we need to create in the first pass.
Expand Down Expand Up @@ -849,6 +890,9 @@ impl<'a> ArrayContents<'a> {
}

impl<'a> FieldDecl<'a> {
/// MDE: This code is very similar to that immediately above. Can they be
/// abstracted so that each calls the same helper routine, which does most
/// of the work?
// Write this entire FieldDecl to the decls file.
fn write(&mut self) {
match &mut *DECLS.lock().unwrap() {
Expand Down Expand Up @@ -884,6 +928,8 @@ impl<'a> FieldDecl<'a> {
}
}

/// MDE: What does it mean to "build up"? Where are the new fields stored?
/// Please improve the documentation and the function name.
// If this FieldDecl represents a struct field, recursively build up our field_decls by
// creating a new FieldDecl for each field.
fn build_fields(&mut self, depth_limit: u32, do_write: &mut bool) {
Expand Down Expand Up @@ -932,6 +978,8 @@ impl<'a> TopLevlDecl<'a> {
// recursively build declarations for the fields.
fn build_fields(&mut self, depth_limit: u32, do_write: &mut bool) {
if depth_limit == 0 {
/// MDE: I think "invalidate ourselves" means to set do_write to false.
/// But "do_write" and "invalidate" are not defined.
// Invalidate ourselves for writing? Or will writing stop too...
return;
}
Expand Down Expand Up @@ -1049,6 +1097,7 @@ impl<'a> TopLevlDecl<'a> {
field_name: field_name.clone(),
};
match &mut tmp.decl.contents {
/// MDE: This is not an informative string.
None => panic!(""),
Some(contents) => {
contents.build_contents(depth_limit - 1, &mut do_write);
Expand Down Expand Up @@ -1110,7 +1159,7 @@ impl<'a> TopLevlDecl<'a> {
}
}

// Helper to write function entries into the decls file.
// Write function entries into the decls file.
fn write_entry(ppt_name: &str) {
match &mut *DECLS.lock().unwrap() {
None => panic!("Cannot access decls"),
Expand All @@ -1121,7 +1170,7 @@ fn write_entry(ppt_name: &str) {
}
}

// Helper to write function exits into the decls file.
// Write function exits into the decls file.
fn write_exit(ppt_name: &str, exit_counter: usize) {
match &mut *DECLS.lock().unwrap() {
None => panic!("Cannot access decls"),
Expand All @@ -1132,7 +1181,7 @@ fn write_exit(ppt_name: &str, exit_counter: usize) {
}
}

// Helper to add a newline in the decls file.
// Add a newline in the decls file.
fn write_newline() {
match &mut *DECLS.lock().unwrap() {
None => panic!("Cannot access decls"),
Expand All @@ -1142,7 +1191,7 @@ fn write_newline() {
}
}

// Helper to write metadata header into the decls file.
// Write metadata header into the decls file.
fn write_header() {
match &mut *DECLS.lock().unwrap() {
None => panic!("Cannot access decls"),
Expand All @@ -1155,6 +1204,7 @@ fn write_header() {
}

impl<'a> DaikonDeclsVisitor<'a> {
/// MDE: And do what if they are found?
// Walk an if expression looking for returns.
// See rustc_parse::parser::item::grok_expr_for_if.
#[allow(rustc::default_hash_types)]
Expand All @@ -1178,6 +1228,7 @@ impl<'a> DaikonDeclsVisitor<'a> {
exit_counter,
);
}
// I suspect that `if_block` should be `then_block`.
ExprKind::If(_, if_block, None) => {
self.grok_block(
ppt_name,
Expand All @@ -1188,7 +1239,11 @@ impl<'a> DaikonDeclsVisitor<'a> {
exit_counter,
);
}
// I suspect that `another_block` should be `else_block`.
ExprKind::If(_, if_block, Some(another_expr)) => {
/// MDE: This duplicates code above for handling "if". Have one
/// match for "If" that first calls this, then matches on the
/// else block to possibly operate on it.
self.grok_block(
ppt_name,
if_block,
Expand All @@ -1206,10 +1261,12 @@ impl<'a> DaikonDeclsVisitor<'a> {
&ret_ty,
);
}
/// MDE: How do you know that the problem was an if statement with an else?
_ => panic!("Internal error handling if stmt with else!"),
}
}

/// MDE: What does "identify" mean? Is it "find"? Once found, what is done with it?
// Process an entire stmt to identify an exit point or recurse on blocks.
// See rustc_parse::parser::item::grok_stmt.
#[allow(rustc::default_hash_types)]
Expand Down Expand Up @@ -1430,6 +1487,7 @@ impl<'a> DaikonDeclsVisitor<'a> {
}),
};
match &mut tmp.contents {
/// MDE: This is not an informative string.
None => panic!(""),
Some(contents) => {
contents.build_contents(
Expand Down Expand Up @@ -1491,6 +1549,7 @@ impl<'a> DaikonDeclsVisitor<'a> {
i
}

/// MDE: What is done once one is found?
// Walk a new block looking for exit points and nested blocks.
// See rustc_parse::parser::item::grok_block.
#[allow(rustc::default_hash_types)]
Expand Down Expand Up @@ -1520,9 +1579,9 @@ impl<'a> DaikonDeclsVisitor<'a> {
}
}

// is it a good idea to store which params are valid at each exit
// Is it a good idea to store which params are valid at each exit
// ppt for the decls pass which happens after this?
// then the decls pass just needs to:
// Then the decls pass just needs to:
// 1: visit_item to build HashMap<ident, StructNode>.
// 2: visit_fn, grok sig, and grok exit ppts using structural
// recursion on StructNodes for nesting. Need to use depth counter
Expand Down Expand Up @@ -1558,10 +1617,13 @@ impl<'a> DaikonDeclsVisitor<'a> {
}
}

// Process a function signature and build up a new Vec<TopLevlDecl>
// ready to be subsequently written to the decls file before we
/// MDE: Is the TopLevlDecl for the entry point? Or for the entry and all
/// exits? Why does this function return a vec rather than one TopLevlDecl?
// Given a function signature, return a new Vec<TopLevlDecl>.
// It will be written to the decls file before we
// walk the function body looking for exit points.
// See rustc_parse::parser::item::grok_fn_sig.
/// MDE: As elsewhere, please use a better name than "map".
#[allow(rustc::default_hash_types)]
fn grok_fn_sig<'a>(
decl: &'a Box<FnDecl>,
Expand All @@ -1586,6 +1648,8 @@ fn grok_fn_sig<'a>(
} // Ready to write this var decl.
}
RepType::HashCodeStruct(ty_string) => {
/// MDE: This isn't actually temporary. It is returned. So
/// name it "result" rather than "tmp".
let mut tmp = TopLevlDecl {
map,
var_name: var_name.clone(),
Expand Down Expand Up @@ -1653,6 +1717,7 @@ fn grok_fn_sig<'a>(
}),
};
match &mut tmp.contents {
/// MDE: This is not an informative string. Please fix throughout.
None => panic!(""),
Some(contents) => {
contents.build_contents(depth_limit - 1, &mut do_write);
Expand Down Expand Up @@ -1731,6 +1796,7 @@ impl<'a> Visitor<'a> for DaikonDeclsVisitor<'a> {
static DECLS: LazyLock<Mutex<Option<std::fs::File>>> = LazyLock::new(|| Mutex::new(dtrace_open()));

// Open the decls file.
/// MDE: "dtrace" in the function name is at odds with the documentation and function body.
fn dtrace_open() -> Option<std::fs::File> {
let decls_path = format!("{}{}", *OUTPUT_PREFIX.lock().unwrap(), ".decls");
let decls = std::path::Path::new(&decls_path);
Expand Down
9 changes: 9 additions & 0 deletions compiler/rustc_parse/src/parser/item.rs
Original file line number Diff line number Diff line change
Expand Up @@ -410,6 +410,10 @@ impl<'a> DaikonDtraceVisitor<'a> {
// body: the block to insert into.
// exit_counter: unique, function-local numeric identifier for an exit ppt.
// ppt_name: program point name.
/// MDE: Throughout, why is the code stored in strings rather than
/// parsed and returned as a data structure? That would lead to
/// parse errors much closer to the point where the code is
/// generated.
// dtrace_param_blocks: Vec of logging code stored in Strings.
// ret_ty: return type of the function.
// daikon_tmp_counter: label for the next temporary variable.
Expand All @@ -428,6 +432,8 @@ impl<'a> DaikonDtraceVisitor<'a> {
vec![String::from(ppt_name), String::from(&*exit_counter.to_string())],
DTRACE_EXIT,
);
/// MDE: Would it make sense to perform this operation within
/// build_instrument_code (or within an overload of it?
*exit_counter += 1;
// FIXME: create overloads of build_instrument_code specialized for
// common tasks like creating exit ppts, which may also do operations
Expand Down Expand Up @@ -619,6 +625,8 @@ impl<'a> DaikonDtraceVisitor<'a> {
// stmt after all inserted stmts.
// loc: index representing the index to the stmt to process.
// body: surrounding block containing the stmt.
/// MDE: Was this called "loc" elsewhere? Please use consistent naming
/// throughout.
// exit_counter: int representing the next number to use to
// label an exit ppt.
// ppt_name: the program point name.
Expand Down Expand Up @@ -1373,6 +1381,7 @@ impl<'a> DaikonDtraceVisitor<'a> {
format!("{}{}", dtrace_print_fields, dtrace_print_fields_epilogue())
}

/// MDE: Yes, I agree with this TODO.
// FIXME: dtrace calls should be represented with a better data structures rather than
// Strings.
// Given a function signature, generate a set of dtrace calls for each parameter,
Expand Down
Loading