[ICA-Sheet 3 - Help], Why Soot, Why?

Moderator: Implementing code analyses for large software systems

Andreas Wittmann
Neuling
Posts: 7
Registered: 15 Oct 2014 23:52

[ICA-Sheet 3 - Help], Why Soot, Why?

Post by Andreas Wittmann

I think we need some help here. We started implementing Exercise 2 today and only 3 test cases are still missing, so it has gone pretty well so far, I'd say. Now we're stuck at testMethod8, the object field setter test.

Code:

private void testMethod8() {
	Container c = new Container();
	setField(c);
	Container d = new Container(c);
	leakData(d.field);
}

private class Container {
	public Container() {
	}

	public Container(Container original) {
		this.field = original.field;
	}

	private String field;
}

private void setField(Container d) {
	d.field = getSecret();
}
	
That code gives us the following Jimple to analyze:

r1 := @parameter0: de.ecspride.sse.ica.callgraphs.interprocedural.DataFlowTarget$Container;

$r2 = specialinvoke r0.<de.ecspride.sse.ica.callgraphs.interprocedural.DataFlowTarget: java.lang.String getSecret()>();

staticinvoke <de.ecspride.sse.ica.callgraphs.interprocedural.DataFlowTarget$Container: void access$1(de.ecspride.sse.ica.callgraphs.interprocedural.DataFlowTarget$Container,java.lang.String)>(r1, $r2);

And there we get the method access$1. This basically checks whether d.field = $r2 is correct. (Since this.field is private, there has to be a runtime access check that the field is writable.) And there is our problem: we don't know which field is being written, because r1 is just the base variable and $r2 is the temporary holding the getSecret() string.

So our approach is to just ignore all access restrictions and taint the base variable r1 as well as every access path that starts with the base local r1. Would that be correct?

Steven
Kernelcompilierer
Posts: 425
Registered: 2 Sep 2008 10:00
Location: Frankfurt am Main

Re: [ICA-Sheet 3 - Help], Why Soot, Why?

Post by Steven

I am not sure whether I understand your problem correctly, but let's try.
Andreas Wittmann wrote: And there we get the method access$1. This basically checks whether d.field = $r2 is correct.
The bytecode does not include any explicit access modifier checks, nor does Jimple. The compiler checks the access modifiers and does not even generate bytecode if you are accessing fields you are not allowed to access. At runtime, it is the task of the JVM to enforce these checks if the compiler messed up or you dynamically loaded code.

The reason why your access is wrapped in an access$1 method is technical. In some cases, the Java language allows you to access private fields (e.g. because the code that accesses the field belongs to the outer class of the class containing the field). However, directly accessing the field would violate the bytecode semantics of a private field (even though the access is perfectly legal in the Java language). Therefore, the compiler plays a trick: it encapsulates the field access in a synthetic static method and calls this method.

For your analysis, this does not make any difference: You analyze the access$1 method just like any other method. If your code works on arbitrary callees, it also works on the access$1 method. No special treatment is required here.
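
To make the trick concrete, here is a hand-written sketch of roughly what the rewrite looks like at the source level. The names Outer, accessField and readField are made up for this example (the compiler chooses names such as access$1 itself), the nested class is static only to keep the demo short, and the exact shape of the generated method depends on the compiler version:

```java
// Hand-written approximation of the compiler's accessor trick.
// All names are hypothetical; the real accessor is generated by the compiler.
final class Outer {
    static final class Container {
        private String field;

        // Stand-in for the synthetic access$1: a static method that performs
        // the private field write on behalf of the outer class.
        static void accessField(Container target, String value) {
            target.field = value;
        }

        // Stand-in for a read accessor so the demo can observe the field.
        static String readField(Container source) {
            return source.field;
        }
    }

    // Source level:   d.field = secret;
    // Bytecode level: a call to the static accessor, as in the Jimple above.
    static void setField(Container d, String secret) {
        Container.accessField(d, secret);
    }

    // Writes a value through the accessor and reads it back.
    static String demo() {
        Container c = new Container();
        setField(c, "secret");
        return Container.readField(c);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints secret
    }
}
```

From the point of view of a data-flow analysis, the field write simply happens one call deeper: the tainted value enters the accessor as the second argument and ends up in the field of the first argument.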

Re: [ICA-Sheet 3 - Help], Why Soot, Why?

Post by Andreas Wittmann

And where do I get the body?

http://imgur.com/FIti6gC

Re: [ICA-Sheet 3 - Help], Why Soot, Why?

Post by Andreas Wittmann

Never mind. Works now and passes all tests. http://i.imgur.com/0vtZhPW.png

For the other groups, I'd like to describe what we did. I would also like to post the complete code, but I suppose I'm not allowed to, or am I? If so, I'd gladly give the other groups some help. Anyway, some discussion of what we did might be helpful as well:

We implemented the analysis pseudo-recursively. Each time we find a new InvokeExpr, we analyze the callee after mapping the context from the caller into the callee. For this we used a CallContext:

Code:

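import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;
import soot.SootMethod;
import soot.jimple.InvokeExpr;

public class CallContext {
	private SootMethod sourceCaller;
	private SootMethod targetCallee;
	private int args_n;
	private InvokeExpr invokeExpr;
	private ArrayList<Set<AccessPath>> paramCallAP;
	private ArrayList<Set<AccessPath>> paramRetAP;
	private Set<AccessPath> returnAP = new HashSet<AccessPath>();
	private Set<AccessPath> thisAP = new HashSet<AccessPath>();
	// ... (remaining members are not shown in this post)
}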
This CallContext gets pushed onto a call stack, where the callee grabs it and maps it into its own scope. (This can be done with a simple state machine: Call -> Grab Context -> Analyze -> ... -> Return -> Map Context Back)
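
A minimal sketch of the "Map Context Back" step, assuming access paths are represented as plain dotted strings such as "c.field". The class ContextMappingSketch and the helper rebase are hypothetical names used only for illustration; an actual implementation would work on Soot locals and fields rather than strings:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch only: access paths are dotted strings such as "c.field".
// ContextMappingSketch and rebase are hypothetical names.
final class ContextMappingSketch {

    // Rewrites every path rooted at 'from' so that it is rooted at 'to'.
    // Paths with a different base (e.g. callee-local temporaries) are dropped.
    static Set<String> rebase(Set<String> paths, String from, String to) {
        Set<String> result = new LinkedHashSet<>();
        for (String path : paths) {
            if (path.equals(from)) {
                result.add(to);
            } else if (path.startsWith(from + ".")) {
                result.add(to + path.substring(from.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Caller: setField(c);  Callee: setField(Container d) { d.field = getSecret(); }
        Set<String> taintedInCallee = new LinkedHashSet<>();
        taintedInCallee.add("d.field"); // result of analyzing the callee
        taintedInCallee.add("$r2");     // callee-local temporary

        // Return edge: map the formal parameter d back to the actual argument c.
        Set<String> taintedInCaller = rebase(taintedInCallee, "d", "c");
        System.out.println(taintedInCaller); // prints [c.field]
    }
}
```

The same helper can be used in the other direction on the call edge, mapping the caller's actual argument onto the callee's formal parameter before the callee is analyzed.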

On the call stack structure we implemented sanity checks. For example, when we hit a return statement, we check whether the current method was actually called before or whether we just started our analysis in that method without calling it (flow insensitive). This happened to us frequently because the analysis usually started in some init method that had never been called before.

Code:

	// Checks whether the call on top of the stack targets method m,
	// i.e. whether there has been an appropriate call before.
	public boolean hasBeenCalledBefore(SootMethod m) {
		if (height == 0)
			return false;
		return this.peek().getTargetCallee().getSignature().equals(m.getSignature());
	}

	// Checks whether the caller recorded on top of the stack is m itself.
	public boolean isDirectRecursion(SootMethod m) {
		if (height == 0)
			return false;
		return this.peek().getSourceCaller().getSignature().equals(m.getSignature());
	}
Detecting indirect recursion via prefix checks would also be implementable, but wasn't necessary for us.

Another annoying thing that we stumbled upon was the fact that a static method doesn't have a this variable. So before mapping a this variable, we made sure that the method is non-static:

Code:

if (!(inv instanceof StaticInvokeExpr)) { .. }
The hardest test to pass was the switchString test, because it is a recursive call with multiple return paths. We introduced merging for this case.
The second hardest test was the objectFieldSetter test, mostly because of annoying this-mapping problems.
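
As an illustration of merging multiple return paths, the sketch below joins the taint sets found at the individual return statements of one callee by set union. Union is an assumption here (it is the usual join for a "may be tainted" analysis); the post does not say which merge was used. ReturnMergeSketch and mergeReturns are hypothetical names, and access paths are again plain strings:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch only: one taint set per return statement, merged by union.
final class ReturnMergeSketch {

    // A path counts as tainted after the call if it is tainted on at
    // least one return path of the callee.
    static Set<String> mergeReturns(List<Set<String>> perReturn) {
        Set<String> merged = new LinkedHashSet<>();
        for (Set<String> taints : perReturn) {
            merged.addAll(taints);
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> returnA = new LinkedHashSet<>(Arrays.asList("ret"));
        Set<String> returnB = new LinkedHashSet<>(Arrays.asList("this.field"));
        Set<String> returnC = new LinkedHashSet<>(); // a path that taints nothing

        Set<String> merged = mergeReturns(Arrays.asList(returnA, returnB, returnC));
        System.out.println(merged); // prints [ret, this.field]
    }
}
```

Union keeps every flow that is possible on some path, at the cost of reporting taints that a particular execution may never produce.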

Recursion in itself wasn't too big of a problem; we took the easiest possible way and just set an upper bound on the height of the call stack.
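
A minimal sketch of such a bound, assuming the call stack stores method signatures as strings. BoundedCallStackSketch, tryPush and analyzeRecursive are hypothetical names, and the "analysis" is reduced to counting how many invocations get analyzed before the bound cuts the recursion off:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: a call stack that refuses to grow beyond a fixed height.
final class BoundedCallStackSketch {
    private final Deque<String> frames = new ArrayDeque<>();
    private final int maxDepth;

    BoundedCallStackSketch(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    // Pushes the callee unless the bound is reached; returns whether it was pushed.
    boolean tryPush(String calleeSignature) {
        if (frames.size() >= maxDepth) {
            return false;
        }
        frames.push(calleeSignature);
        return true;
    }

    void pop() {
        frames.pop();
    }

    int height() {
        return frames.size();
    }

    // Simulates analyzing a method that always calls itself.
    // Returns the number of invocations that were actually analyzed.
    static int analyzeRecursive(BoundedCallStackSketch stack, String signature) {
        if (!stack.tryPush(signature)) {
            return 0; // bound hit: the callee is skipped
        }
        int analyzed = 1 + analyzeRecursive(stack, signature);
        stack.pop();
        return analyzed;
    }

    public static void main(String[] args) {
        BoundedCallStackSketch stack = new BoundedCallStackSketch(3);
        System.out.println(analyzeRecursive(stack, "<Demo: void recurse()>")); // prints 3
        System.out.println(stack.height()); // prints 0
    }
}
```

The bound guarantees termination, but any flow that only shows up deeper than the bound is missed unless the skipped call is approximated in some other way.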

Re: [ICA-Sheet 3 - Help], Why Soot, Why?

Post by Steven

Please do not post solutions on the forum. The lab is graded based on the solutions the various groups submit for the exercise sheets and the final project. If you have concrete questions, please ask them here or e-mail us. Exchanging ideas is totally acceptable, posting complete code snippets or solution blueprints is not. Copying code snippets from other groups, be it from the forum or somewhere else, is plagiarism.

Also please note that the ideas presented here are only one possibility to solve the exercise. There are many others. The goal of the exercise is (to some extent) to also make you evaluate your own analysis and think about various design possibilities. That's why there is, for instance, the question on how you treated loops and recursion. I expect various treatments with various consequences for precision, recall, and performance.

Re: [ICA-Sheet 3 - Help], Why Soot, Why?

Post by Andreas Wittmann

Could you give us some insight into how many groups actually submitted solutions for the last sheet? In the last ICA lesson only 3 groups were present. For this sheet I worked together with another group, because I'm almost alone and it's nice to share ideas and compare solutions, so our solutions are quite similar. (We noted that in solutions.pdf.)

Re: [ICA-Sheet 3 - Help], Why Soot, Why?

Post by Steven

In total, for sheet 1, 94% of all students achieved more than 0 points. For sheet 2, 79% of all students achieved more than 0 points. The average number of points was 10.92 for sheet 1 and 7.43 for sheet 2. Both sheets had a maximum number of 15 points that could be achieved. For each sheet, one group reached the maximum number of points. This was not the same group for the two sheets.
