Time boundary in InfluxDB Group by Time Statement

These days I use InfluxDB to save some time series data. I love these features it provides:

High Performance

According to its hardware guide, a single node can support more than 750k point writes per second, 100 moderate queries per second, and a series cardinality of 10M.

Continuous Queries

Simple aggregation can be done by InfluxDB’s continuous queries.

Overwrite Duplicated Points

If you submit a new point with the same measurement, tag set, and timestamp as an existing point, the new data overwrites the old data.
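The overwrite rule can be pictured with a toy in-memory model. This is only a sketch, not InfluxDB's storage engine: the `write_point` function and the dictionary layout are hypothetical names made up for illustration. It assumes the behavior described in the InfluxDB documentation, where the stored field set becomes the union of the old and new field sets and the new values win on conflicts.

```python
def write_point(store, measurement, tags, timestamp, fields):
    """Toy model: a point is identified by measurement, tag set, and timestamp."""
    # frozenset makes the tag set hashable and independent of tag order.
    key = (measurement, frozenset(tags.items()), timestamp)
    # Fields of an existing point are merged; new values replace old ones.
    store.setdefault(key, {}).update(fields)
    return store


store = {}
write_point(store, "cpu", {"host": "a"}, 1000, {"usage": 0.5, "idle": 0.5})
write_point(store, "cpu", {"host": "a"}, 1000, {"usage": 0.9})
print(len(store))            # one point, not two
print(list(store.values()))  # usage was overwritten, idle was kept
```

A point with a different tag value or timestamp would get its own key and would not be overwritten.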

C3 Linearization and Python MRO (Method Resolution Order)

Python supports multiple inheritance: a class can derive from more than one base class. If an attribute or method is not found in the current class, in what order are the superclasses searched? In simple cases we know the rule: left to right, bottom to top. But when the inheritance hierarchy becomes complicated, the answer is not easy to find by intuition.

For instance, what is the search sequence of class M?

class X: pass
class Y: pass
class Z: pass
class A(X, Y): pass
class B(Y, Z): pass
class M(B, A, Z): pass

The answer is: M, B, A, X, Y, Z, object
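The answer can be checked in two ways: by asking the interpreter for `M.__mro__`, and by running the C3 merge by hand. The sketch below does both. The helpers `c3_merge` and `c3_linearize` are names introduced here for illustration; they are a simplified rendering of the C3 rule, not CPython's actual implementation.

```python
class X: pass
class Y: pass
class Z: pass
class A(X, Y): pass
class B(Y, Z): pass
class M(B, A, Z): pass


def c3_merge(sequences):
    """Merge several linearizations following the C3 rule."""
    result = []
    sequences = [list(s) for s in sequences if s]
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A head is acceptable only if it is in no sequence's tail.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy: no valid linearization")
        result.append(head)
        # Remove the chosen head everywhere and drop emptied sequences.
        sequences = [[c for c in s if c is not head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result


def c3_linearize(cls):
    """L[C] = C + merge(L[B1], ..., L[Bn], [B1, ..., Bn])."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([c3_linearize(b) for b in bases] + [bases])


print([c.__name__ for c in c3_linearize(M)])
print([c.__name__ for c in M.__mro__])
# both print ['M', 'B', 'A', 'X', 'Y', 'Z', 'object']
```

In the merge for M, Y is rejected twice as a candidate head because it still appears in the tail of A's linearization (A, X, Y, object). That is why X comes before Y in the result.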

Difference between Value and Pointer variable in Defer in Go

defer is a useful statement for cleanup: deferred calls execute in LIFO order just before the surrounding function returns. If you don't know how it works, the execution result may sometimes confuse you.

How it Works and Why Value or Pointer Receiver Matters

I found an interesting piece of code on Stack Overflow:

package main

import "fmt"

type X struct {
    S string
}

func (x X) Close() {
    fmt.Println("Value-Closing", x.S)
}

func (x *X) CloseP() {
    fmt.Println("Pointer-Closing", x.S)
}

func main() {
    x := X{"Value-X First"}
    defer x.Close()
    x = X{"Value-X Second"}
    defer x.Close()

    x2 := X{"Value-X2 First"}
    defer x2.CloseP()
    x2 = X{"Value-X2 Second"}
    defer x2.CloseP()

    xp := &X{"Pointer-X First"}
    defer xp.Close()
    xp = &X{"Pointer-X Second"}
    defer xp.Close()

    xp2 := &X{"Pointer-X2 First"}
    defer xp2.CloseP()
    xp2 = &X{"Pointer-X2 Second"}
    defer xp2.CloseP()
}

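The output is:

Pointer-Closing Pointer-X2 Second
Pointer-Closing Pointer-X2 First
Value-Closing Pointer-X Second
Value-Closing Pointer-X First
Pointer-Closing Value-X2 Second
Pointer-Closing Value-X2 Second
Value-Closing Value-X Second
Value-Closing Value-X First

The deferred calls run in reverse order, and the receiver of each call is evaluated when the defer statement executes. A value receiver is copied at that moment, so later assignments do not affect it. For x2.CloseP() the receiver is the address of x2, so both deferred calls see the value x2 holds when main returns.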

Near-duplicate with SimHash

Before talking about SimHash, let’s review some other methods which can also identify duplication.

Longest Common Subsequence (LCS)

This is the algorithm used by the diff command. It is equivalent to edit distance with insertion and deletion as the only two edit operations.

This works well for short strings. However, the algorithm's time complexity is \(O(mn)\) for two strings of lengths \(m\) and \(n\), so it is not suitable for a large corpus. Also, if two documents consist of the same paragraphs in a different order, LCS treats them as different documents, which is not what we expect.
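To make the cost and the ordering problem concrete, here is a minimal sketch of the classic dynamic-programming solution. The function names `lcs_length` and `indel_distance` are chosen here for illustration, and real diff tools use more refined variants. The two nested loops are where the \(O(mn)\) cost comes from.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)  # DP row for the previous element of a
    for item_a in a:
        curr = [0]
        for j, item_b in enumerate(b, start=1):
            if item_a == item_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_distance(a, b):
    """Edit distance when only insertion and deletion are allowed."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


print(lcs_length("ABCBDAB", "BDCABA"))      # 4
print(indel_distance("ABCBDAB", "BDCABA"))  # 5

# Same paragraphs, different order: only one paragraph is "common".
doc1 = ["intro", "body", "summary"]
doc2 = ["summary", "body", "intro"]
print(lcs_length(doc1, doc2))               # 1
```

The last example treats each paragraph as one element. Although both documents contain exactly the same paragraphs, the LCS covers only one of the three.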

Jaeger Code Structure

Here is the main logic of the Jaeger agent and the Jaeger collector (based on Jaeger 1.13.1).

Jaeger Agent

Collects UDP packets on port 6831, converts them to model.Span, and sends them to the collector over gRPC.

Jaeger Collector

Processes gRPC requests, or packets from Zipkin (port 9411).

Jaeger Query

Listens for gRPC and HTTP requests on port 16686.

The Annotated The Annotated Transformer

Thanks to the articles listed at the end of this post, I understand how Transformers work. Those articles are comprehensive, but some points confused me.

First, this is the diagram referenced by almost every post about the Transformer.

The Transformer consists of these parts: Input, Encoder*N, Output Input, Decoder*N, and Output. I'll explain them step by step.

Input

Each input word is mapped to a 512-dimensional vector. A Positional Encoding (PE) is then generated and added to the original embeddings.
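The step of generating the PE and adding it to the embeddings can be sketched as follows. The text above does not say which encoding is used, so this sketch assumes the sinusoidal scheme from the original Transformer paper, where even dimensions use sine and odd dimensions use cosine. The function names are made up for illustration, and plain Python lists stand in for tensors.

```python
import math


def positional_encoding(max_len, d_model=512):
    """Sinusoidal PE table of shape (max_len, d_model); d_model must be even."""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            # i is the even dimension index, so i / d_model equals 2k / d_model.
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            pe[pos][i + 1] = math.cos(angle)
    return pe


def add_positional_encoding(embeddings):
    """Element-wise sum of word embeddings and their positional encodings."""
    pe = positional_encoding(len(embeddings), len(embeddings[0]))
    return [[e + p for e, p in zip(row, pe_row)]
            for row, pe_row in zip(embeddings, pe)]


# Three "words", each already mapped to a 512-dimensional vector of zeros,
# so the result is the PE itself and is easy to inspect.
embeddings = [[0.0] * 512 for _ in range(3)]
encoded = add_positional_encoding(embeddings)
print(len(encoded), len(encoded[0]))  # 3 512
print(encoded[0][:4])                 # [0.0, 1.0, 0.0, 1.0]
```

Because the PE has the same dimension as the embedding, the two are combined by addition rather than concatenation, and the vector size stays at 512.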