I wrote this article because I wanted to learn ASM-related syntax systematically. After reading the book ASM Full Practice and studying the Shence open source project, I thought it would be useful to summarize my understanding of event tracking (often translated literally as "buried points") on Android in one article. The second half of the article analyzes the implementation details of Shence's open source tracking projects.

The purpose of event tracking is that someone wants to obtain information about the running application. That someone may be a developer, a product manager, or the boss; the information may be business data, performance data, and so on. Tracking data goes through a series of complex operations and is converted into forms that are easier to use, so that it can support technical or business analysis. The Android client plays only a small part in this process, but it is still worth trying to see the big picture and sketch the full data analysis pipeline starting from the client.

In fact, the whole data analysis process can be divided into several key stages:

  1. Data collection.
  2. Data transmission.
  3. Data cleaning and storage.
  4. Data visualization display.
  5. Data analysis.

Let me describe my understanding of each stage.

1. Data collection

The first step is to generate data through tracking points. The industry has several tracking techniques for different scenarios.

Manual tracking

Our company uses Umeng's products for this. The approach is easy to understand: wherever we need to record information, we call a tracking API that stores the information for later use. The advantage is that it can record the most complete information, including key data that differs for each user, such as the order amount and discounts in a payment scenario. It is also the most flexible approach: you can record whatever you want, wherever you want.
The disadvantage is that every tracking point has to be added by hand, and only by developers, which makes it costly and creates a maintenance burden.
This approach definitely has its value, and it is used not only on the client but also on the server. The server can record more information, such as key data from the database, and there are many mature server-side logging libraries, such as log4j, as well as third-party services.

Visual tracking

This approach is also easy to understand: it moves the work of marking tracking points from developers to roles such as product and operations, similar in spirit to low-code tools and visual page builders.
The basic principle and process are as follows:

  1. After a page is loaded (for example, the DecorView is initialized after onCreate), the client walks the view tree and sends the page's view hierarchy to the server.
  2. After the server receives the data, the web front end parses it and re-renders a virtual phone screen on a web page.
  3. Product or operations staff manually mark elements on that screen; this is the tracking step. The resulting tracking configuration is stored on the server.
  4. On its next run, the client reads the tracking configuration and reports the tracking event when the user clicks the corresponding control.
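
Step 4 can be pictured with a small lookup table. This is only a sketch: the class VisualTrackingConfig, the "page#viewPath" key format, and the event names are assumptions made for illustration, not the format any particular SDK uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical client-side view of the tracking configuration downloaded from the server. */
final class VisualTrackingConfig {
    // Maps "page#viewPath" to the event name chosen in the web console (assumed key format).
    private final Map<String, String> eventByElement = new HashMap<>();

    /** Stores one element that product or operations staff marked on the web page. */
    void register(String page, String viewPath, String eventName) {
        eventByElement.put(page + "#" + viewPath, eventName);
    }

    /** Returns the event to report for a click, or empty if the control was never marked. */
    Optional<String> eventForClick(String page, String viewPath) {
        return Optional.ofNullable(eventByElement.get(page + "#" + viewPath));
    }
}
```

On every click the client would compute the path of the clicked view and report an event only when eventForClick returns a value.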

For specific details of visual buried points, please refer to the following information:

Full tracking

Full tracking is also called codeless tracking. It automatically collects events in specific scenarios while the application is running. It has the highest degree of automation and requires no manual tracking code. However, this also makes it less flexible: only simple data can be collected, not detailed business data.
Typical usage scenarios for full tracking include:

  1. Application start and exit (crash, user-initiated exit, kill by the low memory killer (LMK), etc.).
  2. Application page life cycle events.
  3. Clicks on controls in application pages.
  4. Performance data: CPU, memory, frame rate, network request quality, etc.

Later we focus on how this approach collects click events on page controls. For details, see the analysis of the Shence open source tracking project below.

2. Data transmission

Tracking data format

The previous step collected the tracking data. Here is a brief introduction to its format:

  • Default fields: common fields collected with every event, such as a device ID or cookie (used to identify a unique user), the device model, and the operating system version.
  • Event fields: the key of the event, followed by the event's own properties. Different events can also share some common fields to make unified processing on the back end easier.

A few points to note about the stored data format:

  1. The tracking data format corresponds to the Event + User model of data analysis, so that data can be drilled down by user and by event dimensions.
  2. The same user may have different user IDs on different platforms. ID-mapping, which is common in the industry, is used here to link them into one user.
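
The split between default fields and event fields could be sketched as follows. The class EventRecordBuilder and all field names (device_id, user_id, event, properties, and so on) are hypothetical; a real SDK defines its own schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical builder that merges default fields with the fields of one event. */
final class EventRecordBuilder {
    private final Map<String, Object> defaultFields = new LinkedHashMap<>();

    EventRecordBuilder(String deviceId, String deviceModel, String osVersion) {
        // Default fields are collected once and attached to every event.
        defaultFields.put("device_id", deviceId);
        defaultFields.put("device_model", deviceModel);
        defaultFields.put("os_version", osVersion);
    }

    /** Builds one record in the Event + User shape: who, when, which event, and its properties. */
    Map<String, Object> build(String userId, String eventKey,
                              Map<String, Object> eventFields, long timestampMillis) {
        Map<String, Object> record = new LinkedHashMap<>(defaultFields);
        record.put("user_id", userId);
        record.put("event", eventKey);
        record.put("time", timestampMillis);
        record.put("properties", new LinkedHashMap<>(eventFields));
        return record;
    }
}
```

Keeping the event-specific properties in their own nested map leaves the top-level fields uniform across events, which is what makes unified back-end processing practical.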

Tracking data storage

How tracking data is stored matters a great deal, because the amount of data generated is very large. For example, our Android application generates about 120 million log records every day. The data needs some optimization; otherwise storage and transmission both come under considerable pressure.

There are two options for storing tracking data:

  • Database (SQLite).
    This option suits relatively small amounts of data, because database storage is comparatively expensive. Manual tracking solutions such as Umeng generally store data in a database. One advantage is that data handling is more flexible: you can freely choose the upload strategy, for example uploading the data for a given period of time, and if an upload fails, the failed data can be rolled back.
  • File.
    This option suits relatively large amounts of data. Log data can be compressed and encrypted and then stored in binary form to achieve a higher compression ratio. High-performance logging solutions are generally used here; you can implement one yourself or build on an open source one. Our company uses a customized version of WeChat's xlog.
    One disadvantage of file storage is that uploading is less flexible, because a file is a coarser unit than a database row. We try to cut files into small pieces for uploading so that a failed piece can easily be retransmitted.
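
Cutting a log file into pieces and remembering which pieces failed might look like the sketch below. LogChunker, ChunkUploader, and the chunk size are assumptions for illustration; this is not how xlog or any specific SDK uploads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that splits log bytes into small pieces for upload. */
final class LogChunker {
    /** Assumed transport abstraction: returns true when the piece was uploaded successfully. */
    interface ChunkUploader {
        boolean upload(int index, byte[] chunk);
    }

    /** Splits the data into consecutive pieces of at most chunkSize bytes. */
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(data.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(data, offset, end));
        }
        return chunks;
    }

    /** Uploads every piece once and returns the indexes that failed, so only those are retried. */
    static List<Integer> uploadAll(List<byte[]> chunks, ChunkUploader uploader) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            if (!uploader.upload(i, chunks.get(i))) {
                failed.add(i);
            }
        }
        return failed;
    }
}
```

Because each piece is uploaded independently, a network error costs one small retransmission instead of the whole file.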

Tracking data transmission

Tracking data is basically uploaded over HTTPS. When uploading, avoid going through the application's basic network library as far as possible, so that logging and uploading do not call each other in a loop. You can also agree on upload strategies; for example, the upload interval can differ between foreground and background, to balance performance against timeliness.
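
An interval-based upload strategy could be as simple as the following sketch. UploadPolicy and the interval values used in the usage note are hypothetical.

```java
/** Hypothetical upload policy with different intervals for foreground and background. */
final class UploadPolicy {
    private final long foregroundIntervalMillis;
    private final long backgroundIntervalMillis;

    UploadPolicy(long foregroundIntervalMillis, long backgroundIntervalMillis) {
        this.foregroundIntervalMillis = foregroundIntervalMillis;
        this.backgroundIntervalMillis = backgroundIntervalMillis;
    }

    /** Returns true once the interval for the current app state has elapsed since the last upload. */
    boolean shouldUpload(boolean inForeground, long lastUploadMillis, long nowMillis) {
        long interval = inForeground ? foregroundIntervalMillis : backgroundIntervalMillis;
        return nowMillis - lastUploadMillis >= interval;
    }
}
```

For example, new UploadPolicy(15_000, 300_000) would upload every 15 seconds while the user is active and every 5 minutes in the background; the right numbers depend on the data volume and on how fresh the data needs to be.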

3. Data cleaning and storage.

This stage covers the work the server does after the client uploads the data. I have not researched this part in depth, so I only describe it briefly.

  • Data reception: a server such as nginx receives the data.
  • Data stream processing: Kafka or similar serves as a buffer between data ingestion and data processing.
  • Data storage: HDFS, ClickHouse, etc.
  • Data query: mainly SQL, which can also back front-end dashboards for visual queries. We use Grafana.

4. Data visualization display.

There is no fixed recipe for visualization. It mainly means drawing charts that serve the later data analysis or provide observability, which in turn helps us automate business operations.
Here are a few examples based on full tracking scenarios. For the code-level analysis, see the analysis of the Shence open source tracking project below.

  • Click collection on application controls. Group the collected click data by page. In the simple case, you can draw a pie chart of each control's share of clicks on the page. In the more complex case, you can borrow the idea from visual tracking and draw a heat map of click counts over the page area, so that product or operations staff can reposition key features based on the share of clicks in each area.
  • Application page life cycle event collection. The page flow of a business process can be drawn as a Sankey diagram; user behavior paths can be analyzed with a target event as the start or end point, and the business conversion can be analyzed and improved.
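
The data behind such a pie chart is just each control's share of the clicks on one page. A minimal sketch, assuming each click is represented as a {page, control} pair (the class ClickShare and this representation are hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical aggregation of click events into per-control shares for one page. */
final class ClickShare {
    /** Each click is {page, control}. Returns control -> share of the page's clicks, in [0, 1]. */
    static Map<String, Double> shareByControl(List<String[]> clicks, String page) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int total = 0;
        for (String[] click : clicks) {
            if (click[0].equals(page)) {
                counts.merge(click[1], 1, Integer::sum);
                total++;
            }
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            shares.put(entry.getKey(), entry.getValue() / (double) total);
        }
        return shares;
    }
}
```

In practice this grouping would normally be a SQL query on the server side; the Java version only shows the shape of the computation.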

5. Data analysis

I think this is where tracking data really pays off: only by analyzing the data can we generate positive feedback for the business. This is also my weak spot, so here I share some conclusions from my own study of data analysis.
Data analysis here includes not only traditional statistics-based methods but also newer methods based on user profiles, recommendation, machine learning, and AI.

  • Behavioral event analysis model
    This model is the closest to the raw data we collect, because each collected record is an Event.
    Behavioral event analysis generally has the following stages:
      1. Event definition and selection.
         The collected information includes: who (the unique user, identified by userId, device ID, etc.), when (the timestamp of the event), where (the location where the event occurred; for example, the client reports the user's IP and the server resolves it against an address library), how (the source of the event, such as a terminal identifier), and what (the key and data of the recorded event).
      2. Multi-dimensional drill-down analysis.
         The event analysis system should support arbitrary drill-down and fine-grained filtering.
      3. Interpretation and conclusions.
  • Funnel analysis model
    This model reflects user behavior and the conversion rate at each stage from the starting point to the end point. You can optimize conversion online by comparing conversion rates under different conditions and running A/B tests.
  • Retention analysis model
  • User path analysis model
    This is similar to the funnel analysis model, except that the paths are many-to-many.
  • Cluster analysis model
    Unlike the traditional analysis models, this one uses techniques such as user profiling to cluster users and then run fine-grained push and win-back campaigns.
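
The core of the funnel model can be sketched in a few lines. This is a simplification under stated assumptions: each step is given as the set of users who triggered that step's event, and the ordering of events in time is ignored. FunnelSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical funnel computation over per-step user sets. */
final class FunnelSketch {
    /** For each step, counts the users who triggered it and also passed every earlier step. */
    static List<Integer> usersReachingEachStep(List<Set<String>> usersByStep) {
        List<Integer> reached = new ArrayList<>();
        Set<String> surviving = null;
        for (Set<String> stepUsers : usersByStep) {
            Set<String> next = new HashSet<>(stepUsers);
            if (surviving != null) {
                next.retainAll(surviving); // keep only users who passed every earlier step
            }
            surviving = next;
            reached.add(surviving.size());
        }
        return reached;
    }

    /** Conversion rate between two steps of the funnel; 0 when nobody reached fromStep. */
    static double conversionRate(List<Integer> reached, int fromStep, int toStep) {
        int from = reached.get(fromStep);
        return from == 0 ? 0.0 : reached.get(toStep) / (double) from;
    }
}
```

Comparing conversionRate for two user groups (for example, the two arms of an A/B test) is what turns the funnel into an optimization tool.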

Once you know the models above, you can run simple analyses directly with the visual query tools of the system. For complex queries or for building dashboards, however, an Android client developer also needs a certain amount of SQL skill.

Analysis of Shence’s open source projects

Tracking SDK: github.com/sensorsdata…
Tracking plug-in SDK: github.com/sensorsdata…

This part requires some background in writing Gradle plug-ins, which most readers probably already have. Note that from version 7.3 on, the plug-in no longer relies on the deprecated Transform API of the Android Gradle Plugin (AGP) for modifying classes; it uses the new instrumentation API that AGP provides instead. In complex cases we may also need tasks to help solve the problem.

class V73Impl(project: Project, override val asmWrapperFactory: AsmCompatFactory) :
    AGPCompatInterface {
    
    init {
        val androidComponents = project.extensions.getByType(AndroidComponentsExtension::class.java)
        V73AGPContextImpl.asmCompatFactory = asmWrapperFactory
        androidComponents.onVariants { variant: Variant ->
            variant.instrumentation.transformClassesWith(
                SensorsDataAsmClassVisitorFactory::class.java,
                InstrumentationScope.ALL
            ) {
                ...
            }
            variant.instrumentation
                .setAsmFramesComputationMode(FramesComputationMode.COPY_FRAMES)
        }
    }

}

As you can see, the new API follows a fixed pattern: obtain the AndroidComponentsExtension, then register a class visitor factory on each variant, here SensorsDataAsmClassVisitorFactory. Configuration can be injected through the trailing lambda block.
InstrumentationScope.ALL means that the instrumentation covers the whole application: both our own project code and third-party libraries.

This part also requires some basic knowledge of ASM. Assuming you have it, let's continue:

ClassVisitor:

abstract class SensorsDataAsmClassVisitorFactory :
    AsmClassVisitorFactory<ConfigInstrumentParams> {

    override fun createClassVisitor(
        classContext: ClassContext,
        nextClassVisitor: ClassVisitor
    ): ClassVisitor {
        V73AGPContextImpl.asmCompatFactory!!.onBeforeTransform()
        val classInheritance = object : ClassInheritance {
            override fun isAssignableFrom(subClass: String, superClass: String): Boolean {
                return classContext.loadClassData(subClass)?.let {
                    it.className == superClass || it.superClasses.contains(superClass) || it.interfaces.contains(superClass)
                } ?: false
            }

            override fun loadClass(className: String): ClassInfo? {
                return classContext.loadClassData(className)?.let {
                    ClassInfo(
                        it.className,
                        interfaces = it.interfaces,
                        superClasses = it.superClasses
                    )
                }
            }
        }

        return V73AGPContextImpl.asmCompatFactory!!.transform(
            nextClassVisitor, classInheritance
        )
    }

    override fun isInstrumentable(classData: ClassData): Boolean {
        return V73AGPContextImpl.asmCompatFactory!!.isInstrumentable(
            ClassInfo(
                classData.className,
                interfaces = classData.interfaces,
                superClasses = classData.superClasses
            )
        )
    }
}

The factory only needs to implement the AsmClassVisitorFactory interface. As you can see, integrating with ASM is very convenient: all that remains is to implement the concrete ClassVisitor, which in this project is SAPrimaryClassVisitor.

We analyze SAPrimaryClassVisitor in the order in which its methods are called.

Visiting the class

override fun visit(
        version: Int,
        access: Int,
        name: String,
        signature: String?,
        superName: String?,
        interfaces: Array<out String>?
    ) {
        super.visit(version, access, name, signature, superName, interfaces)
        classNameAnalytics = ClassNameAnalytics(name, superName, interfaces?.asList())
        shouldReturnJSRAdapter = version <= Opcodes.V1_5
        configHookHelper.initConfigCellInClass(name)
    }

Class metadata is stored in classNameAnalytics.
SAConfigHookHelper implements a plug-in feature that lets configuration delete certain methods; as the code comment further down notes, a deleted method is re-created in visitEnd.

Visiting the methods of the class
override fun visitMethod(
        access: Int,
        name: String?,
        descriptor: String?,
        signature: String?,
        exceptions: Array<String>?
    ): MethodVisitor? { 
        ...
        //MethodVisitor
    }
//check whether need to delete this method. if the method is deleted,
//a new method will be created at visitEnd()
if (configHookHelper.isConfigsMethod(name, descriptor)) {
    return null
}

If the method matches the configuration, it is one that needs to be deleted: visitMethod returns null so that the original method is dropped, and the method is recorded in mHookMethodCells of SAConfigHookHelper.

if (classNameAnalytics.superClass == "android/app/Activity"
    && name == "onNewIntent" && descriptor == "(Landroid/content/Intent;)V"
) {
    isFoundOnNewIntent = true
}

This records that the Activity already overrides onNewIntent.

After that, the corresponding MethodVisitor is created; it is analyzed later.

Finishing the class visit
override fun visitEnd() {
        super.visitEnd()

        // 1. Push module: add onNewIntent if the Activity does not override it
        if (pluginManager.isModuleEnable(SAModule.PUSH)
            && !isFoundOnNewIntent
            && classNameAnalytics.superClass == "android/app/Activity"
        ) {
            SensorsPushInjected.addOnNewIntent(classVisitor)
        }

        // 2. Autotrack module: instrument the Fragment life cycle methods
        if (pluginManager.isModuleEnable(SAModule.AUTOTRACK)) {
            FragmentHookHelper.hookFragment(
                classVisitor,
                classNameAnalytics.superClass,
                visitedFragMethods
            )
        }

        // 3. Re-create the methods that were deleted through configuration
        configHookHelper.disableIdentifierMethod(classVisitor)
    }

Three things are done here:
1. If the Activity does not override onNewIntent, an onNewIntent method is added to it.

 fun addOnNewIntent(classVisitor: ClassVisitor) {
        val mv = classVisitor.visitMethod(
            Opcodes.ACC_PROTECTED,
            "onNewIntent",
            "(Landroid/content/Intent;)V",
            null,
            null
        )
        mv.visitAnnotation("Lcom/sensorsdata/analytics/android/sdk/SensorsDataInstrumented;", false)
        mv.visitCode()
        mv.visitVarInsn(Opcodes.ALOAD, 0)
        mv.visitVarInsn(Opcodes.ALOAD, 1)
        mv.visitMethodInsn(
            Opcodes.INVOKESPECIAL,
            "android/app/Activity",
            "onNewIntent",
            "(Landroid/content/Intent;)V",
            false
        )
        mv.visitVarInsn(Opcodes.ALOAD, 0)
        mv.visitVarInsn(Opcodes.ALOAD, 1)
        mv.visitMethodInsn(
            Opcodes.INVOKESTATIC,
            PUSH_TRACK_OWNER,
            "onNewIntent",
            "(Ljava/lang/Object;Landroid/content/Intent;)V",
            false
        )
        mv.visitInsn(Opcodes.RETURN)
        mv.visitMaxs(2, 2)
        mv.visitEnd()
    }

This logic shows how to add a new method with ASM.

2. Fragment life cycle instrumentation
This mainly inserts calls to FragmentTrackHelper into the Fragment life cycle methods.
A small trick here is that the framework wraps the method-call instructions in helpers, so there is no need to write a lot of boilerplate instruction code.

// call super
methodCell.visitMethod(mv, Opcodes.INVOKESPECIAL, superName!!)
// call injected method
methodCell.visitHookMethod(
    mv,
    Opcodes.INVOKESTATIC,
    SensorsFragmentHookConfig.SENSORS_FRAGMENT_TRACK_HELPER_API
)

Both the call to super and the call to the injected method are wrapped in helpers.

3. Clear the bodies of the methods recorded in mHookMethodCells of SAConfigHookHelper.

In-method instrumentation (implementation of click event instrumentation)

The click event instrumentation we care about happens while visiting method bodies, so we return to the visitMethod method. It creates a chain of nested MethodVisitors, which we analyze from the outside in.

The methods of a MethodVisitor are called in the following order:

(visitParameter)*
[visitAnnotationDefault]
(visitAnnotation | visitAnnotableParameterCount | visitParameterAnnotation | visitTypeAnnotation | visitAttribute)*
[
    visitCode
    (
        visitFrame |
        visitXxxInsn |
        visitLabel |
        visitInsnAnnotation |
        visitTryCatchBlock |
        visitTryCatchAnnotation |
        visitLocalVariable |
        visitLocalVariableAnnotation |
        visitLineNumber
    )*
    visitMaxs
]
visitEnd

1. UpdateSDKPluginVersionMV
override fun visitFieldInsn(opcode: Int, owner: String, fieldName: String, descriptor: String) {
        if (mClassNameAnalytics.isSensorsDataAPI && "ANDROID_PLUGIN_VERSION" == fieldName && opcode == PUTSTATIC) {
            mMethodVisitor.visitLdcInsn(VersionConstant.VERSION)
        }
        super.visitFieldInsn(opcode, owner, fieldName, descriptor)
    }

This visitor handles the point where SensorsDataAPI assigns its ANDROID_PLUGIN_VERSION field: just before the PUTSTATIC instruction it pushes the current plug-in version onto the operand stack, so that PUTSTATIC stores the plug-in version instead of the original value.

2. SensorsAutoTrackMethodVisitor

This class is where click event instrumentation is actually implemented, and it is the focus of our analysis.

class SensorsAutoTrackMethodVisitor(
    mv: MethodVisitor,
    methodAccess: Int,
    methodName: String,
    var desc: String,
    private val classNameAnalytics: ClassNameAnalytics,
    private val visitedFragMethods: MutableSet<String>,
    lambdaMethodCells: MutableMap<String, SensorsAnalyticsMethodCell>,
    private val pluginManager: SAPluginManager
) : AdviceAdapter(
    pluginManager.getASMVersion(), mv,
    methodAccess,
    methodName,
    desc
)

You can see that the class extends AdviceAdapter. In addition to the visit methods listed earlier, AdviceAdapter adds two callbacks:

public override fun onMethodEnter() {}
public override fun onMethodExit(opcode: Int) {}

These two methods are called at the beginning of the method and just before it exits (on return or throw), which makes it easy to weave in custom code.
We walk through this class in the order in which ASM calls its methods.

(1) Traverse annotations
override fun visitAnnotation(s: String, b: Boolean): AnnotationVisitor {
        if (s == "Lcom/sensorsdata/analytics/android/sdk/SensorsDataTrackViewOnClick;") {
            isSensorsDataTrackViewOnClickAnnotation = true
        } else if (s == "Lcom/sensorsdata/analytics/android/sdk/SensorsDataIgnoreTrackOnClick;") {
            isSensorsDataIgnoreTrackOnClick = true
        } else if (s == "Lcom/sensorsdata/analytics/android/sdk/SensorsDataInstrumented;") {
            isHasInstrumented = true
        } else if (s == "Lcom/sensorsdata/analytics/android/sdk/SensorsDataTrackEvent;") {
            return object : AnnotationVisitor(pluginManager.getASMVersion()) {
                override fun visit(key: String, value: Any) {
                    super.visit(key, value)
                    if ("eventName" == key) {
                        eventName = value as String
                    } else if ("properties" == key) {
                        eventProperties = value.toString()
                    }
                }
            }
        }
        return super.visitAnnotation(s, b)
    }

The main strategy for identifying clicks relies on the standard listener pattern (setOnClickListener with an onClick method). With some frameworks the click handler may have a different name; for example, with ButterKnife or DataBinding you need to annotate the click methods manually so that they can be recognized and instrumented.
This method extracts the information carried by those annotations.

(2) Method entry point
public override fun onMethodEnter() {
        super.onMethodEnter()
        pubAndNoStaticAccess =
            SAUtils.isPublic(access) && !SAUtils.isStatic(
                access
            )
        protectedAndNotStaticAccess =
            SAUtils.isProtected(access) && !SAUtils.isStatic(
                access
            )
        if (pubAndNoStaticAccess) {
            if (nameDesc == "onClick(Landroid/view/View;)V") {
                isOnClickMethod = true
                variableID = newLocal(Type.getObjectType("java/lang/Integer"))
                mMethodVisitor.visitVarInsn(ALOAD, 1)
                mMethodVisitor.visitVarInsn(ASTORE, variableID)
            } else { ... }
        } else if (protectedAndNotStaticAccess) {
            if (nameDesc == "onListItemClick(Landroid/widget/ListView;Landroid/view/View;IJ)V") {
                localIds = ArrayList()
                val firstLocalId = newLocal(Type.getObjectType("java/lang/Object"))
                mMethodVisitor.visitVarInsn(ALOAD, 1)
                mMethodVisitor.visitVarInsn(ASTORE, firstLocalId)
                localIds!!.add(firstLocalId)
                val secondLocalId = newLocal(Type.getObjectType("android/view/View"))
                mMethodVisitor.visitVarInsn(ALOAD, 2)
                mMethodVisitor.visitVarInsn(ASTORE, secondLocalId)
                localIds!!.add(secondLocalId)
                val thirdLocalId = newLocal(Type.INT_TYPE)
                mMethodVisitor.visitVarInsn(ILOAD, 3)
                mMethodVisitor.visitVarInsn(ISTORE, thirdLocalId)
                localIds!!.add(thirdLocalId)
            }
        }

        ...
        if (pluginManager.isHookOnMethodEnter) {
            handleCode()
        }
    }

This method handles click-related methods. Take the most common way of defining a click listener as an example:

private void initButton() {
        findViewById(R.id.button).setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {

            }
        });
    }

This falls into the public, non-static branch, which matches the onClick method. The plug-in copies the View parameter into a newly allocated local variable slot:

isOnClickMethod = true
variableID = newLocal(Type.getObjectType("java/lang/Integer"))
mMethodVisitor.visitVarInsn(ALOAD, 1)
mMethodVisitor.visitVarInsn(ASTORE, variableID)

The other branches are handled in a similar way, for example the branch that matches item click events of an AdapterView.
Because parameters may be reassigned or their slots optimized in real projects, the plug-in saves the relevant parameters in onMethodEnter so that they can be read correctly when the tracking code is inserted.

One case deserves special attention: the effect of lambdas on method instrumentation.

D8/R8 desugars lambda syntax. Here is an example with a Java lambda expression.
Original code:

public class Java8 {

    interface Logger {
        void log(String s);
    }

    public static void main(String... args) {
        test(s -> System.out.println(s));
    }

    private static void test(Logger logger) {
        logger.log("hello");
    }

}

Code after desugaring:

public class Java8 {

    interface Logger {
        void log(String s);
    }

    public static void main(String... args) {
        // The lambda is replaced by an instance of the generated class.
        test(new Java8$1());
    }

    // Synthetic method that holds the body of the lambda.
    static void lambda$main$0(String str) {
        System.out.println(str);
    }

    private static void test(Logger logger) {
        logger.log("hello");
    }

}

public class Java8$1 implements Java8.Logger {
    public Java8$1() {}

    @Override
    public void log(String s) {
        Java8.lambda$main$0(s);
    }

}

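After desugaring, a class that implements the interface is generated, and the method it calls contains the code from the lambda block.
Note that the implementing class (Java8$1 here) does not exist in the class file that javac produces. javac only emits the synthetic method lambda$main$0 and, at the place where the lambda is created, an invokedynamic instruction; the implementing class is produced later, at runtime on the JVM or during desugaring by D8/R8.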
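
The synthetic method that holds a lambda body can be observed with plain reflection. The sketch below assumes the class is compiled with javac and run on a regular JVM; LambdaInspector and its method names are made up for this example, and other compilers may name the generated method differently.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the compiler-generated methods holding lambda bodies. */
final class LambdaInspector {
    interface Logger {
        void log(String s);
    }

    static void run(Logger logger) {
        logger.log("hello");
    }

    /** Contains one lambda, so the compiler has to generate one method for its body. */
    static void demo() {
        run(s -> System.out.println(s));
    }

    /** Names of synthetic methods declared by the class whose name starts with "lambda$". */
    static List<String> syntheticLambdaMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic() && method.getName().startsWith("lambda$")) {
                names.add(method.getName());
            }
        }
        return names;
    }
}
```

Calling LambdaInspector.syntheticLambdaMethods(LambdaInspector.class) returns a non-empty list under these assumptions. This generated method is what a bytecode plug-in has to instrument when a click listener is written as a lambda, because no onClick method exists in the class file.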

invokedynamic instruction

The invokedynamic instruction was introduced in JDK 7 to support dynamically typed languages on the JVM; since Java 8, javac also emits it for lambda expressions.
The JDK classes related to this instruction are:

  1. MethodType
public static MethodType methodType(Class<?> rtype, Class<?>[] ptypes) {
    return makeImpl(rtype, ptypes, false);
}

MethodType represents the return value type and all parameter types required by a method.
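
A MethodType maps directly onto the JVM method descriptors that appear throughout the ASM code in this article. A small sketch on a plain JVM (MethodTypeDemo is a made-up name; on Android these classes require API level 26 or higher):

```java
import java.lang.invoke.MethodType;

/** Hypothetical demo of converting between MethodType and JVM descriptors. */
final class MethodTypeDemo {
    /** Descriptor of a method that takes an int and returns a String, like String.valueOf(int). */
    static String valueOfDescriptor() {
        return MethodType.methodType(String.class, int.class).toMethodDescriptorString();
    }

    /** Parses a descriptor back into a MethodType and returns its number of parameters. */
    static int parameterCount(String descriptor) {
        return MethodType
                .fromMethodDescriptorString(descriptor, MethodTypeDemo.class.getClassLoader())
                .parameterCount();
    }
}
```

valueOfDescriptor() yields "(I)Ljava/lang/String;", the same notation as the "(Landroid/view/View;)V" strings passed to ASM.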

  2. MethodHandle

A MethodHandle is a typed, directly executable reference to a method. It is looked up by class, method name, and MethodType, and can then be invoked.

@RequiresApi(api = Build.VERSION_CODES.O)
    public void foo(Context context) {
        try {
            MethodType methodType = MethodType.methodType(String.class, int.class);
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle methodHandle = lookup.findStatic(String.class, "valueOf", methodType);
            String result = (String) methodHandle.invoke(99);
            ToastUtil.showLong(context, result);
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }

The example above dynamically executes String.valueOf through a MethodHandle.

  3. CallSite

A CallSite represents a method call site and holds a MethodHandle as its target; each invokedynamic instruction is linked to one CallSite. This may sound somewhat abstract.
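
To make the relationship concrete, a CallSite can be built by hand around a MethodHandle. This sketch uses the JDK's ConstantCallSite on a plain JVM; CallSiteDemo is a made-up name.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical demo that calls String.valueOf(int) through a CallSite. */
final class CallSiteDemo {
    static String callThroughCallSite(int value) {
        try {
            MethodHandle valueOf = MethodHandles.lookup().findStatic(
                    String.class, "valueOf", MethodType.methodType(String.class, int.class));
            // A ConstantCallSite is permanently bound to one target handle.
            CallSite callSite = new ConstantCallSite(valueOf);
            // dynamicInvoker() returns a handle that always calls the site's current target.
            return (String) callSite.dynamicInvoker().invoke(value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

An invokedynamic instruction does the same thing implicitly: its bootstrap method returns a CallSite, and every later execution of the instruction calls that site's target.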

Let’s look at an example of invokedynamic:

import java.util.Date;
import java.util.function.Consumer;

public class TestLambda {

    public void test() {
        final Date date = new Date();
        Consumer<String> consumer = s -> {
            System.out.println(s+ date);
        };
    }

}

Use javap to observe bytecode:

public void test();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=3, args_size=1
         0: new           #2                  // class java/util/Date
         3: dup
         4: invokespecial #3                  // Method java/util/Date."<init>":()V
         7: astore_1
         8: aload_1
         9: invokedynamic #4,  0              // InvokeDynamic #0:accept:(Ljava/util/Date;)Ljava/util/function/Consumer;
        14: astore_2
        15: return
      LineNumberTable:
        line 9: 0
        line 10: 8
        line 13: 15

The invokedynamic instruction appears at offset 9. The trailing 0 is a reserved operand, and #4 is an index into the constant pool:

 #4 = InvokeDynamic      #0:#30         // #0:accept:(Ljava/util/Date;)Ljava/util/function/Consumer;

#0 here refers to the first entry in the BootstrapMethods table. The bootstrap method is called at runtime, the first time the instruction executes, to link the call site to the desugared lambda code:

BootstrapMethods:
  0: #26 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #27 (Ljava/lang/Object;)V
      #28 invokestatic com/sensorsdata/sdk/demo/TestLambda.lambda$test$0:(Ljava/util/Date;Ljava/lang/String;)V
      #29 (Ljava/lang/String;)V

The bootstrap method that is actually called is LambdaMetafactory.metafactory:

public static CallSite metafactory(MethodHandles.Lookup caller,
                                       String invokedName,
                                       MethodType invokedType,
                                       MethodType samMethodType,
                                       MethodHandle implMethod,
                                       MethodType instantiatedMethodType)

Here we focus on the last three parameters; the first three are supplied automatically by the JVM:

    Method arguments:
      #27 (Ljava/lang/Object;)V
      #28 invokestatic com/sensorsdata/sdk/demo/TestLambda.lambda$test$0:(Ljava/util/Date;Ljava/lang/String;)V
      #29 (Ljava/lang/String;)V

samMethodType: the signature of the abstract method in the functional interface, here the accept method of the Consumer interface. Because of generic type erasure, the parameter type is Object.
implMethod: the static method generated for the lambda body by desugaring. It has two parameters here because the lambda captures the final variable date declared outside the lambda block; the captured value is passed first.

  private static void lambda$test$0(java.util.Date, java.lang.String);
    descriptor: (Ljava/util/Date;Ljava/lang/String;)V
    flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
    Code:
      stack=3, locals=2, args_size=2
         0: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
         3: new           #6                  // class java/lang/StringBuilder
         6: dup
         7: invokespecial #7                  // Method java/lang/StringBuilder."<init>":()V
        10: aload_1
        11: invokevirtual #8                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        14: aload_0
        15: invokevirtual #9                  // Method java/lang/StringBuilder.append:(Ljava/lang/Object;)Ljava/lang/StringBuilder;
        18: invokevirtual #10                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        21: invokevirtual #11                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        24: return

instantiatedMethodType: the signature of samMethodType after the generic parameters are specialized. Here the erased Object is restored to String.
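To make the three arguments concrete, here is a minimal sketch that calls LambdaMetafactory.metafactory by hand, which is roughly what the JVM does when it links the invokedynamic call site above. The class name MetafactoryDemo, the method lambdaBody (a stand-in for the synthetic lambda$test$0) and the OUT buffer are assumptions made for illustration only.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Date;
import java.util.function.Consumer;

final class MetafactoryDemo {
    // Collects output so the result can be inspected (illustrative only).
    static final StringBuilder OUT = new StringBuilder();

    // Stand-in for lambda$test$0: the captured Date comes first, then the real argument.
    private static void lambdaBody(Date date, String s) {
        OUT.append(s).append(date.getTime());
    }

    @SuppressWarnings("unchecked")
    static Consumer<String> link(Date captured) {
        try {
            MethodHandles.Lookup caller = MethodHandles.lookup();
            MethodHandle implMethod = caller.findStatic(
                    MetafactoryDemo.class, "lambdaBody",
                    MethodType.methodType(void.class, Date.class, String.class));
            CallSite site = LambdaMetafactory.metafactory(
                    caller,
                    "accept",                                          // invokedName
                    MethodType.methodType(Consumer.class, Date.class), // invokedType: (Date)Consumer
                    MethodType.methodType(void.class, Object.class),   // samMethodType: erased to Object
                    implMethod,                                        // the desugared lambda body
                    MethodType.methodType(void.class, String.class));  // instantiatedMethodType
            // Invoking the call site target with the captured value yields the Consumer instance.
            return (Consumer<String>) site.getTarget().invoke(captured);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        link(new Date(0L)).accept("millis=");
        System.out.println(OUT); // prints "millis=0"
    }
}
```

The captured Date shows up twice: as the parameter of invokedType, and as the leading parameter of implMethod. That is why the desugared method has one more parameter than accept.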

invokedynamic instrumentation

With the above basic knowledge, we can analyze invokedynamic instrumentation.
Use a simple click event for sample analysis:

private void initLambdaButton() {
    Button button = (Button) findViewById(R.id.lambdaButton);
    button.setOnClickListener(v -> {

    });
}

The framework pre-adds a lambda configuration for click events:

addLambdaMethod(
    SensorsAnalyticsMethodCell(
        "onClick",
        "(Landroid/view/View;)V",
        "Landroid/view/View$OnClickListener;",
        "trackViewOnClick",
        "(Landroid/view/View;)V",
        1, 1,
        listOf(Opcodes.ALOAD)
    )
)

Then intercept the invokedynamic instruction:

override fun visitInvokeDynamicInsn(
    name1: String,
    desc1: String,
    bsm: Handle,
    vararg bsmArgs: Any
) {
    super.visitInvokeDynamicInsn(name1, desc1, bsm, *bsmArgs)
    if (!pluginManager.extension.lambdaEnabled) {
        return
    }
    try {
        val owner = bsm.owner
        if ("java/lang/invoke/LambdaMetafactory" != owner) {
            return
        }
        val desc2 = (bsmArgs[0] as Type).descriptor
        val sensorsAnalyticsMethodCell: SensorsAnalyticsMethodCell? =
            SensorsAnalyticsHookConfig.LAMBDA_METHODS.get(
                Type.getReturnType(desc1).descriptor + name1 + desc2
            )
        if (sensorsAnalyticsMethodCell != null) {
            val it = bsmArgs[1] as Handle
            mLambdaMethodCells[it.name + it.desc] = sensorsAnalyticsMethodCell
        }
    } catch (e: Exception) {
        warn("Some exception happened when call visitInvokeDynamicInsn: className: " + classNameAnalytics.className + ", error message: " + e.localizedMessage)
    }
}

The purpose of this method is to populate the mLambdaMethodCells map. Its key is the name + desc of the method generated by desugaring, and its value is the SensorsAnalyticsMethodCell lambda configuration registered above.
To understand this method you need the knowledge of lambdas and the invokedynamic instruction covered above. name1 and desc1 are the name and descriptor of the invokedynamic call site: name1 is the name of the functional interface method (onClick here), and the return type of desc1 is the functional interface itself. bsm is the handle of the bootstrap method, metafactory, and bsmArgs holds the bootstrap method's static arguments, namely the samMethodType, implMethod and instantiatedMethodType analyzed earlier.
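The lookup key built in visitInvokeDynamicInsn can be reproduced without ASM. The sketch below is a simplified illustration: LambdaKeyDemo, its LAMBDA_METHODS map and the returnDescriptor helper (a string-based substitute for Type.getReturnType(desc).descriptor) are hypothetical names, and the call-site descriptor assumes a lambda that captures nothing.

```java
import java.util.HashMap;
import java.util.Map;

final class LambdaKeyDemo {
    // Hypothetical stand-in for the plugin's lambda configuration: key -> helper method name.
    static final Map<String, String> LAMBDA_METHODS = new HashMap<>();

    static {
        LAMBDA_METHODS.put(
                "Landroid/view/View$OnClickListener;onClick(Landroid/view/View;)V",
                "trackViewOnClick");
    }

    // Everything after ')' in a method descriptor is its return type descriptor.
    static String returnDescriptor(String methodDesc) {
        return methodDesc.substring(methodDesc.indexOf(')') + 1);
    }

    // Same shape as: Type.getReturnType(desc1).descriptor + name1 + desc2
    static String lookupKey(String indyName, String indyDesc, String samDesc) {
        return returnDescriptor(indyDesc) + indyName + samDesc;
    }

    public static void main(String[] args) {
        String key = lookupKey(
                "onClick",                               // name1: functional interface method
                "()Landroid/view/View$OnClickListener;", // desc1: call site returns the listener
                "(Landroid/view/View;)V");               // desc2: samMethodType descriptor
        System.out.println(key + " -> " + LAMBDA_METHODS.get(key));
    }
}
```

Because the key combines the interface type, the method name and the method descriptor, only lambdas that implement a configured listener method are matched.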

(3) Method exit point

At the method exit point, the real instrumentation function is implemented.

public override fun onMethodExit(opcode: Int) {
    super.onMethodExit(opcode)
    if (!pluginManager.isHookOnMethodEnter) {
        handleCode()
    }
}

As you can see, the main logic is in handleCode.

Function 1: Call instrumentation on Fragment related methods.
In the ClassVisitor, the Fragment lifecycle methods that the class does not override are generated and instrumented. If the class already overrides one of these Fragment methods, only the tracking call needs to be inserted, which is what happens here.

if (SAPackageManager.isInstanceOfFragment(classNameAnalytics.superClass)) {
    val sensorsAnalyticsMethodCell: SensorsAnalyticsMethodCell? =
        SensorsFragmentHookConfig.FRAGMENT_METHODS[nameDesc]
    if (sensorsAnalyticsMethodCell != null) {
        visitedFragMethods.add(nameDesc)
        // mMethodVisitor.visitVarInsn(ALOAD, 0)
        for (i in 0 until sensorsAnalyticsMethodCell.paramsCount) {
            mMethodVisitor.visitVarInsn(
                sensorsAnalyticsMethodCell.opcodes[i],
                localIds!![i]
            )
        }
        mMethodVisitor.visitMethodInsn(
            INVOKESTATIC,
            SensorsFragmentHookConfig.SENSORS_FRAGMENT_TRACK_HELPER_API,
            sensorsAnalyticsMethodCell.agentName,
            sensorsAnalyticsMethodCell.agentDesc,
            false
        )
        isHasTracked = true
        return
    }
}

Here we insert the corresponding method of FragmentTrackHelper.

Function 2: Process lambda calls.
We have spent a long time on how lambda expressions are compiled, and this is where it finally pays off.

val lambdaMethodCell: SensorsAnalyticsMethodCell? = mLambdaMethodCells[nameDesc]

This first line can look confusing. Earlier, mLambdaMethodCells was keyed by the name + desc of the desugared method, so why is it looked up here with the nameDesc of the current method?
The answer is that we were looking at it from the wrong level. At this point we are inside the MethodVisitor of the desugared method itself, and mLambdaMethodCells is shared at the ClassVisitor level. All the earlier preparation was there so that we can recognize the real desugared method when we reach it. Otherwise, because the name of the desugared method is generated by the compiler, we would have no way of knowing which method carries the lambda's click handler.
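The two-phase idea can be sketched in isolation. This is a simplified model, not the plugin's code: TwoPhaseLambdaDemo and its method names are hypothetical, the value is reduced to the helper method name, and it assumes the method containing the invokedynamic instruction is visited before the synthetic lambda method (the order in which javac normally emits them).

```java
import java.util.HashMap;
import java.util.Map;

final class TwoPhaseLambdaDemo {
    // Shared for the whole class being transformed, like mLambdaMethodCells.
    private final Map<String, String> lambdaMethodCells = new HashMap<>();

    // Phase 1: called while visiting the method that contains the invokedynamic instruction.
    // implName/implDesc come from the implMethod handle (bsmArgs[1]).
    void onInvokeDynamic(String implName, String implDesc, String agentName) {
        lambdaMethodCells.put(implName + implDesc, agentName);
    }

    // Phase 2: called at the exit of every method in the class.
    // Returns the helper to call, or null when the method is not a recorded lambda body.
    String onMethodExit(String name, String desc) {
        return lambdaMethodCells.get(name + desc);
    }

    public static void main(String[] args) {
        TwoPhaseLambdaDemo visitor = new TwoPhaseLambdaDemo();
        visitor.onInvokeDynamic("lambda$initLambdaButton$0", "(Landroid/view/View;)V", "trackViewOnClick");
        System.out.println(visitor.onMethodExit("initLambdaButton", "()V"));
        System.out.println(visitor.onMethodExit("lambda$initLambdaButton$0", "(Landroid/view/View;)V"));
    }
}
```

Only the synthetic method recorded in phase 1 gets a non-null result in phase 2, so ordinary methods in the same class are left alone.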

for (i in paramStart until paramStart + lambdaMethodCell.paramsCount) {
    mMethodVisitor.visitVarInsn(
        lambdaMethodCell.opcodes.get(i - paramStart),
        localIds!![i - paramStart]
    )
}

First load the variables needed for the instrumentation method from the local variable table to the operand stack.

mMethodVisitor.visitMethodInsn(
    INVOKESTATIC,
    SensorsAnalyticsHookConfig.SENSORS_ANALYTICS_API,
    lambdaMethodCell.agentName,
    lambdaMethodCell.agentDesc,
    false
)

After loading the parameters, call the trackViewOnClick method of SensorsDataAutoTrackHelper.

Function 3: Special handling for Android TV.

if (isAndroidTv && SAPackageManager.isInstanceOfActivity(classNameAnalytics.superClass) && nameDesc == "dispatchKeyEvent(Landroid/view/KeyEvent;)Z") {
    mMethodVisitor.visitVarInsn(ALOAD, 0)
    mMethodVisitor.visitVarInsn(ALOAD, 1)
    mMethodVisitor.visitMethodInsn(
        INVOKESTATIC,
        SensorsAnalyticsHookConfig.SENSORS_ANALYTICS_API,
        "trackViewOnClick",
        "(Landroid/app/Activity;Landroid/view/KeyEvent;)V",
        false
    )
    isHasTracked = true
    return
}

Since Android TV has physical buttons, it is necessary to additionally intercept the dispatchKeyEvent event and instrument it.

Function 4: Process click events.

This finally brings me back to my original goal: learning how click events are instrumented.

if (isOnClickMethod && classNameAnalytics.className == "android/databinding/generated/callback/OnClickListener") {
    trackViewOnClick(mMethodVisitor, 1)
    isHasTracked = true
    return
}

This instruments the click events generated by data binding.

if (isSensorsDataTrackViewOnClickAnnotation && desc == "(Landroid/view/View;)V") {
    trackViewOnClick(mMethodVisitor, 1)
    isHasTracked = true
    return
}

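Click handlers bound by third-party frameworks are covered through an annotation: a method marked with the tracking annotation and taking a single View parameter is instrumented in the same way.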

if (isOnClickMethod) {
    trackViewOnClick(mMethodVisitor, variableID)
    isHasTracked = true
}

This is the most common case: a click listener implemented as an anonymous inner class.

private fun trackViewOnClick(mv: MethodVisitor, index: Int) {
    mv.visitVarInsn(ALOAD, index)
    mv.visitMethodInsn(
        INVOKESTATIC,
        SensorsAnalyticsHookConfig.SENSORS_ANALYTICS_API,
        "trackViewOnClick",
        "(Landroid/view/View;)V",
        false
    )
}

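After all the ASM covered so far, this method is simple enough that it needs no further explanation: it loads the View from the local variable table and calls the static trackViewOnClick helper.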
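In source terms, the effect of this instrumentation is roughly the following. The sketch uses hypothetical stand-ins (DemoView for android.view.View, DemoTrackHelper for the SDK helper class) so that it runs without the Android SDK; it shows the shape of the result, not the plugin's actual output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for android.view.View.
final class DemoView {
    final String id;

    DemoView(String id) {
        this.id = id;
    }
}

// Hypothetical stand-in for the SDK helper that records the click.
final class DemoTrackHelper {
    static final List<String> TRACKED = new ArrayList<>();

    static void trackViewOnClick(DemoView view) {
        TRACKED.add(view.id);
    }
}

final class InstrumentedClickDemo {
    // What an onClick method behaves like after instrumentation.
    void onClick(DemoView v) {
        // ... original body of the listener ...

        // Inserted at method exit: ALOAD 1 (the view parameter), then INVOKESTATIC.
        DemoTrackHelper.trackViewOnClick(v);
    }

    public static void main(String[] args) {
        new InstrumentedClickDemo().onClick(new DemoView("lambdaButton"));
        System.out.println(DemoTrackHelper.TRACKED); // prints "[lambdaButton]"
    }
}
```

In an instance method, slot 0 holds this and slot 1 holds the first parameter, which is why the data binding and annotation cases above pass 1 as the index.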

(4) Method traversal ends

override fun visitEnd() {
    super.visitEnd()
    if (isHasTracked) {
        if (pluginManager.extension.lambdaEnabled) {
            mLambdaMethodCells.remove(nameDesc)
        }
        visitAnnotation(
            "Lcom/sensorsdata/analytics/android/sdk/SensorsDataInstrumented;",
            false
        )
    }
}

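This does some cleanup. If the method was instrumented, its entry is removed from mLambdaMethodCells and the method is marked with the SensorsDataInstrumented annotation.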

3. SensorsAnalyticsPushMethodVisitor

This visitor supports push notifications; its main job is to inject calls to the PushAutoTrackHelper APIs. For reasons of space, I will not go into detail here.

4. SensorsAnalyticsWebViewMethodVisitor

This visitor handles WebView specially; the main goal is to connect H5 and the app. Connecting here means that data from H5 pages embedded in the client is forwarded to the native side, and the client reports all tracking events in a unified way. This lets H5 pages reuse the client's data storage and transmission capabilities, which reduces the loss rate of tracking data. It is also part of the ID-mapping function, which achieves unified user identification.
The function of this MethodVisitor is to automatically inject JsBridge.

//
positionList.reversed().forEach { tmp ->
    loadLocal(tmp)
}
val newDesc = SAUtils.appendDescBeforeGiven(desc, VIEW_DESC)
mv.visitMethodInsn(INVOKESTATIC, JS_BRIDGE_API, name, newDesc, false)

Replace WebView related calls with JSHookAop related calls. For example, the loadUrl method of WebView is replaced by the loadUrl method of JSHookAop.
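Replacing an instance call with a static call works because both consume the same operand stack: the receiver followed by the arguments. The static method only needs a descriptor with the receiver type as its first parameter. The sketch below shows that descriptor rewrite; prependReceiver is a hypothetical helper that illustrates what a function like appendDescBeforeGiven is assumed to do, not its actual source.

```java
final class StaticBridgeDescDemo {
    // Inserts the receiver type in front of the existing parameters of a method descriptor.
    static String prependReceiver(String methodDesc, String receiverDesc) {
        // methodDesc always starts with '(', so the receiver goes right after it.
        return "(" + receiverDesc + methodDesc.substring(1);
    }

    public static void main(String[] args) {
        // WebView.loadUrl(String) as an instance method descriptor.
        String original = "(Ljava/lang/String;)V";
        // Descriptor of a static replacement that takes the view as its first argument.
        System.out.println(prependReceiver(original, "Landroid/view/View;"));
        // prints "(Landroid/view/View;Ljava/lang/String;)V"
    }
}
```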

The loadUrl method calls the setupH5Bridge method.

private static void setupH5Bridge(View webView) {
    if (isSupportJellyBean() && SensorsDataAPI.getConfigOptions() != null &&
            SensorsDataAPI.getConfigOptions().isAutoTrackWebView()) {
        setupWebView(webView);
    }
    if (isSupportJellyBean()) {
        SAModuleManager.getInstance().invokeModuleFunction(Modules.Visual.MODULE_NAME, Modules.Visual.METHOD_ADD_VISUAL_JAVASCRIPTINTERFACE, webView);
    }
}

Inject jsBridge:

private static void setupWebView(View webView) {
    if (webView != null && webView.getTag(com.sensorsdata.analytics.android.sdk.R.id.sensors_analytics_tag_view_webview) == null) {
        webView.setTag(com.sensorsdata.analytics.android.sdk.R.id.sensors_analytics_tag_view_webview, new Object());
        H5Helper.addJavascriptInterface(webView, new AppWebViewInterface(webView.getContext().getApplicationContext(), null, false, webView), "SensorsData_APP_New_H5_Bridge");
    }
}
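The tag check in setupWebView makes the injection idempotent: loadUrl may be called many times on the same WebView, but the bridge is added only once. A minimal sketch of that pattern follows, using hypothetical stand-ins (DemoWebView, BridgeSetupDemo, TAG_KEY) in place of the Android and SDK classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a WebView with tags and JavaScript interfaces.
final class DemoWebView {
    private final Map<Integer, Object> tags = new HashMap<>();
    final Map<String, Object> jsInterfaces = new HashMap<>();
    int addCalls = 0;

    Object getTag(int key) {
        return tags.get(key);
    }

    void setTag(int key, Object value) {
        tags.put(key, value);
    }

    void addJavascriptInterface(Object bridge, String name) {
        addCalls++;
        jsInterfaces.put(name, bridge);
    }
}

final class BridgeSetupDemo {
    static final int TAG_KEY = 1; // hypothetical tag id

    static void setupWebView(DemoWebView webView) {
        // The tag marks views that already have the bridge, so repeated calls are no-ops.
        if (webView != null && webView.getTag(TAG_KEY) == null) {
            webView.setTag(TAG_KEY, new Object());
            webView.addJavascriptInterface(new Object(), "SensorsData_APP_New_H5_Bridge");
        }
    }

    public static void main(String[] args) {
        DemoWebView webView = new DemoWebView();
        setupWebView(webView);
        setupWebView(webView); // second call does nothing
        System.out.println(webView.addCalls); // prints "1"
    }
}
```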

Summary:

At this point, the entire tracking plug-in has been analyzed. We can see that it contains not only the click event instrumentation we originally wanted to understand, but also many other functions such as lambda compatibility and Fragment lifecycle event instrumentation. A complete SDK is much more complicated than a demo.
