Linux下电骡aMule Kademlia网络构建分析4

Stella981
• 阅读 480

aMule中联系人的管理

aMule中主要通过CContact,CRoutingBin和CRoutingZone这样几个类来管理它的联系人。

CContact表示一个联系人,它包含了与一个联系人有关的所有的信息,这个类的对象可能是根据从文件中读出来的信息创建的,也可能是根据其它节点发送的连接请求中的信息创建的。

CRoutingBin是CContact的容器,保存了一组CContact,也就是一个Zone的联系人。

CRoutingZone是aMule中管理联系人的核心。aMule用CRoutingZone将它的所有联系人组织为一颗二叉树。

aMule联系人的表示CContact

先来简单的看一下aMule中的联系人的表示CContact类。这个类在amule-2.3.1/src/kademlia/routing/Contact.h文件中定义:

class CContact
{
public:
    ~CContact();
    CContact(const CUInt128 &clientID,
        uint32_t ip, uint16_t udpPort, uint16_t tcpPort, uint8_t version,
        const CKadUDPKey& kadKey, bool ipVerified,
        const CUInt128 &target = CKademlia::GetPrefs()->GetKadID());

    CContact(const CContact& k1);

    const CUInt128& GetClientID() const throw()        { return m_clientID; }
    void SetClientID(const CUInt128& clientID) throw()    { m_clientID = clientID; m_distance = CKademlia::GetPrefs()->GetKadID() ^ clientID; }

    const wxString GetClientIDString() const        { return m_clientID.ToHexString(); }

    const CUInt128& GetDistance() const throw()        { return m_distance; }
    const wxString GetDistanceString() const        { return m_distance.ToBinaryString(); }

    uint32_t GetIPAddress() const throw()            { return m_ip; }
    void     SetIPAddress(uint32_t ip) throw()        { if (m_ip != ip) { SetIPVerified(false); m_ip = ip; } }

    uint16_t GetTCPPort() const throw()            { return m_tcpPort; }
    void     SetTCPPort(uint16_t port) throw()        { m_tcpPort = port; }

    uint16_t GetUDPPort() const throw()            { return m_udpPort; }
    void     SetUDPPort(uint16_t port) throw()        { m_udpPort = port; }

    uint8_t     GetType() const throw()            { return m_type; }

    void     UpdateType() throw();
    void     CheckingType() throw();

    bool     InUse() const throw()                { return m_inUse > 0; }
    void     IncUse() throw()                { m_inUse++; }
    void     DecUse()                    { if (m_inUse) m_inUse--; else { wxFAIL; } }

    time_t     GetCreatedTime() const throw()            { return m_created; }

    void     SetExpireTime(time_t value) throw()        { m_expires = value; };    
    time_t     GetExpireTime() const throw()            { return m_expires; }

    time_t     GetLastTypeSet() const throw()            { return m_lastTypeSet; }

    time_t     GetLastSeen() const throw();

    uint8_t     GetVersion() const throw()            { return m_version; }
    void     SetVersion(uint8_t value) throw()        { m_version = value; }

    const CKadUDPKey& GetUDPKey() const throw()        { return m_udpKey; }
    void     SetUDPKey(const CKadUDPKey& key) throw()    { m_udpKey = key; }

    bool     IsIPVerified() const throw()            { return m_ipVerified; }
    void     SetIPVerified(bool ipVerified) throw()        { m_ipVerified = ipVerified; }

    bool    GetReceivedHelloPacket() const throw()        { return m_receivedHelloPacket; }
    void    SetReceivedHelloPacket() throw()        { m_receivedHelloPacket = true; }

private:
    CUInt128    m_clientID;
    CUInt128    m_distance;
    uint32_t    m_ip;
    uint16_t    m_tcpPort;
    uint16_t    m_udpPort;
    uint8_t        m_type;
    time_t        m_lastTypeSet;
    time_t        m_expires;
    time_t        m_created;
    uint32_t    m_inUse;
    uint8_t        m_version;
    bool        m_ipVerified;
    bool        m_receivedHelloPacket;
    CKadUDPKey    m_udpKey;
};

我们主要关注这个类所具有的成员变量,以便于了解在aMule中,它会管理联系人的哪些信息。可以看到有ip地址,clientID,client的TCP端口号,client的UDP端口好,版本号等静态信息,以及对象创建时间,对象的有效期等动态的信息。

注意CContact构造函数的声明,target参数的默认值为本节点KadID。

再来看一下这个class的成员函数的实现:

CContact::~CContact()
{
    theStats::RemoveKadNode();
}

CContact::CContact(const CUInt128 &clientID, uint32_t ip, uint16_t udpPort, uint16_t tcpPort, uint8_t version, const CKadUDPKey& key, bool ipVerified, const CUInt128 &target)
    : m_clientID(clientID),
      m_distance(target ^ clientID),
      m_ip(ip),
      m_tcpPort(tcpPort),
      m_udpPort(udpPort),
      m_type(3),
      m_lastTypeSet(time(NULL)),
      m_expires(0),
      m_created(m_lastTypeSet),
      m_inUse(0),
      m_version(version),
      m_ipVerified(ipVerified),
      m_receivedHelloPacket(false),
      m_udpKey(key)
{
    wxASSERT(udpPort);
    theStats::AddKadNode();
}

CContact::CContact(const CContact& k1)
{
    *this = k1;
    theStats::AddKadNode();
}

void CContact::CheckingType() throw()
{
    time_t now = time(NULL);

    if(now - m_lastTypeSet < 10 || m_type == 4) {
        return;
    }

    m_lastTypeSet = now;

    m_expires = now + MIN2S(2);
    m_type++;
}

void CContact::UpdateType() throw()
{
    time_t now = time(NULL);
    uint32_t hours = (now - m_created) / HR2S(1);
    switch (hours) {
        case 0:
            m_type = 2;
            m_expires = now + HR2S(1);
            break;
        case 1:
            m_type = 1;
            m_expires = now + MIN2S(90); //HR2S(1.5)
            break;
        default:
            m_type = 0;
            m_expires = now + HR2S(2);
    }
}

time_t CContact::GetLastSeen() const throw()
{
    // calculating back from expire time, so we don't need an additional field.
    // might result in wrong values if doing CheckingType() for example, so don't use for important timing stuff
    if (m_expires != 0) {
        switch (m_type) {
            case 2: return m_expires - HR2S(1);
            case 1: return m_expires - MIN2S(90) /*(unsigned)HR2S(1.5)*/;
            case 0: return m_expires - HR2S(2);
        }
    }
    return 0;
}

这里主要关注CContact的构造函数。可以看到,m_distance被初始化为clientID和target的异或。如前面CContact构造函数的声明,则m_distance默认情况下将是clientID和本节点的KadID的异或。m_lastTypeSet和m_created被设置为了当前时间,m_expires节点有效期则被设置为了0。

跟type有关的数字,全用的是magic number,这code好烂。

CRoutingZone的构造


接着来看,CContact的二叉树是如何被一步步构造出来的。

先来看一下CRoutingZone都有哪写成员变量:

    time_t     m_nextBigTimer;
    time_t     m_nextSmallTimer;     /**
     * Zone pair is an array of two. Either both entries are null, which
     * means that *this* is a leaf zone, or both are non-null which means
     * that *this* has been split into equally sized finer zones.
     * The zone with index 0 is the one closer to our *self* token.
     */
    CRoutingZone *m_subZones[2];
    CRoutingZone *m_superZone;

    static wxString m_filename;
    static CUInt128 me;

    /**
     * The level indicates what size chunk of the address space
     * this zone is representing. Level 0 is the whole space,
     * level 1 is 1/2 of the space, level 2 is 1/4, etc.
     */
    uint32_t m_level;

    /**
     * This is the distance in number of zones from the zone at this level
     * that contains the center of the system; distance is wrt the XOR metric.
     */
    CUInt128 m_zoneIndex;

    /** List of contacts, if this zone is a leaf zone. */
    CRoutingBin *m_bin;
};

m_subZones和m_superZone是构造二叉树所必须的结构,m_subZones[0]指向子树0,m_subZones[1]指向子树1,而m_superZone则指向树中本节点的父节点。

m_bin是本CRoutingZone的CRoutingBin,如我们在本文最开头所述,这是CContact的容器,用来保存CContact,如果当前CRoutingZone不是叶子节点的话,则这个变量将为NULL。

m_level表示当前CRoutingZone在二叉树中的层次,最顶层的CRoutingZone该值为0,子CRoutingZone的m_level值是其父CRoutingZone的m_level值加一。此外,如注释中所述的那样,这个值指示了当前zone表示的地址空间的块大小。Level 0是整个空间,Level 1是1/2个空间,Level 2是1/4个,等等等等,依次类推。

回想 Linux下电骡aMule Kademlia网络构建分析I 一文中,我们有看到,CRoutingZone::ReadFile()函数在读取了文件中的联系人信息之后,会调用AddUnfiltered()将一个联系人添加到CRoutingZone中。这里我们就来仔仔细细地看一下,CContact的添加过程(amule-2.3.1/src/kademlia/routing/RoutingZone.cpp):

// Returns true if a contact was added or updated, false if the routing table was not touched.
bool CRoutingZone::AddUnfiltered(const CUInt128& id, uint32_t ip, uint16_t port, uint16_t tport, uint8_t version, const CKadUDPKey& key, bool& ipVerified, bool update, bool fromHello)
{
    if (id != me) {
        CContact *contact = new CContact(id, ip, port, tport, version, key, ipVerified);
        if (fromHello) {
            contact->SetReceivedHelloPacket();
        }
        if (Add(contact, update, ipVerified)) {
            wxASSERT(!update);
            return true;
        } else {
            delete contact;
            return update;
        }
    }
    return false;
}

在这个函数里,会首先创建一个CContact,如果节点信息来自于一个连接建立请求的Hello消息,则设置ReceivedHelloPacket,然后调用CRoutingZone::Add()函数向RoutingZone中添加节点:

bool CRoutingZone::Add(CContact *contact, bool& update, bool& outIpVerified)
{
    // If we're not a leaf, call add on the correct branch.
    if (!IsLeaf()) {
        return m_subZones[contact->GetDistance().GetBitNumber(m_level)]->Add(contact, update, outIpVerified);
    } else {
        // Do we already have a contact with this KadID?
        CContact *contactUpdate = m_bin->GetContact(contact->GetClientID());
        if (contactUpdate) {
            if (update) {
                if (contactUpdate->GetUDPKey().GetKeyValue(theApp->GetPublicIP(false)) != 0 && contactUpdate->GetUDPKey().GetKeyValue(theApp->GetPublicIP(false)) != contact->GetUDPKey().GetKeyValue(theApp->GetPublicIP(false))) {
                    // if our existing contact has a UDPSender-Key (which should be the case for all > = 0.49a clients)
                    // except if our IP has changed recently, we demand that the key is the same as the key we received
                    // from the packet which wants to update this contact in order to make sure this is not a try to
                    // hijack this entry
                    AddDebugLogLineN(logKadRouting, wxT("Sender (") + KadIPToString(contact->GetIPAddress()) + wxT(") tried to update contact entry but failed to provide the proper sender key (Sent Empty: ") + (contact->GetUDPKey().GetKeyValue(theApp->GetPublicIP(false)) == 0 ? wxT("Yes") : wxT("No")) + wxT(") for the entry (") + KadIPToString(contactUpdate->GetIPAddress()) + wxT(") - denying update"));
                    update = false;
                } else if (contactUpdate->GetVersion() >= 1 && contactUpdate->GetVersion() < 6 && contactUpdate->GetReceivedHelloPacket()) {
                    // legacy kad2 contacts are allowed only to update their RefreshTimer to avoid having them hijacked/corrupted by an attacker
                    // (kad1 contacts do not have this restriction as they might turn out as kad2 later on)
                    // only exception is if we didn't received a HELLO packet from this client yet
                    if (contactUpdate->GetIPAddress() == contact->GetIPAddress() && contactUpdate->GetTCPPort() == contact->GetTCPPort() && contactUpdate->GetVersion() == contact->GetVersion() && contactUpdate->GetUDPPort() == contact->GetUDPPort()) {
                        wxASSERT(!contact->IsIPVerified());    // legacy kad2 nodes should be unable to verify their IP on a HELLO
                        outIpVerified = contactUpdate->IsIPVerified();
                        m_bin->SetAlive(contactUpdate);
                        AddDebugLogLineN(logKadRouting, CFormat(wxT("Updated kad contact refreshtimer only for legacy kad2 contact (%s, %u)")) % KadIPToString(contactUpdate->GetIPAddress()) % contactUpdate->GetVersion());
                    } else {
                        AddDebugLogLineN(logKadRouting, CFormat(wxT("Rejected value update for legacy kad2 contact (%s -> %s, %u -> %u)")) 
                            % KadIPToString(contactUpdate->GetIPAddress()) % KadIPToString(contact->GetIPAddress()) % contactUpdate->GetVersion() % contact->GetVersion());
                        update = false;
                    }
                } else {
#ifdef __DEBUG__
                    // just for outlining, get removed anyway
                    //debug logging stuff - remove later
                    if (contact->GetUDPKey().GetKeyValue(theApp->GetPublicIP(false)) == 0) {
                        if (contact->GetVersion() >= 6 && contact->GetType() < 2) {
                            AddDebugLogLineN(logKadRouting, wxT("Updating > 0.49a + type < 2 contact without valid key stored ") + KadIPToString(contact->GetIPAddress()));
                        }
                    } else {
                        AddDebugLogLineN(logKadRouting, wxT("Updating contact, passed key check ") + KadIPToString(contact->GetIPAddress()));
                    }

                    if (contactUpdate->GetVersion() >= 1 && contactUpdate->GetVersion() < 6) {
                        wxASSERT(!contactUpdate->GetReceivedHelloPacket());
                        AddDebugLogLineN(logKadRouting, CFormat(wxT("Accepted update for legacy kad2 contact, because of first HELLO (%s -> %s, %u -> %u)"))
                            % KadIPToString(contactUpdate->GetIPAddress()) % KadIPToString(contact->GetIPAddress()) % contactUpdate->GetVersion() % contact->GetVersion());
                    }
#endif
                    // All other nodes (Kad1, Kad2 > 0.49a with UDPKey checked or not set, first hello updates) are allowed to do full updates
                    // do not let Kad1 responses overwrite Kad2 ones
                    if (m_bin->ChangeContactIPAddress(contactUpdate, contact->GetIPAddress()) && contact->GetVersion() >= contactUpdate->GetVersion()) {
                        contactUpdate->SetUDPPort(contact->GetUDPPort());
                        contactUpdate->SetTCPPort(contact->GetTCPPort());
                        contactUpdate->SetVersion(contact->GetVersion());
                        contactUpdate->SetUDPKey(contact->GetUDPKey());
                        // don't unset the verified flag (will clear itself on ipchanges)
                        if (!contactUpdate->IsIPVerified()) {
                            contactUpdate->SetIPVerified(contact->IsIPVerified());
                        }
                        outIpVerified = contactUpdate->IsIPVerified();
                        m_bin->SetAlive(contactUpdate);
                        if (contact->GetReceivedHelloPacket()) {
                            contactUpdate->SetReceivedHelloPacket();
                        }
                    } else {
                        update = false;
                    }
                }
            }
            return false;
        } else if (m_bin->GetRemaining()) {
            update = false;
            // This bin is not full, so add the new contact
            return m_bin->AddContact(contact);
        } else if (CanSplit()) {
            // This bin was full and split, call add on the correct branch.
            Split();
            return m_subZones[contact->GetDistance().GetBitNumber(m_level)]->Add(contact, update, outIpVerified);
        } else {
            update = false;
            return false;
        }
    }
}

这个函数在添加节点时,主要分为以下的几种情况来处理:

1. 当前的这个RoutingZone不是二叉树的一个叶子节点。若当前的这个RoutingZone不是二叉树的一个叶子节点,则把contact加入到子树中。那究竟要将contact添加进0和1两个子树中的哪一个呢?这主要由contact的m_distance,对应于当前层的那一位上的值决定。(contact的m_distance通常是contact与本Kademlia节点KadID的异或值。)

amule-2.3.1/src/kademlia/utils/UInt128.h:

/** Bit at level 0 being most significant. */
    unsigned GetBitNumber(unsigned bit) const throw()
    {
        return bit <= 127 ? (m_data[bit / 32] >> (31 - (bit % 32))) & 1 : 0;
    }

比如当前为第0层,如果m_distance的第0位上是0,则节点将被放进当前RoutingZone的子树0中,若为1,则将被放入当前RoutingZone的子树1中。

这个地方还可以再看一下,UInt128的第n位的含义,0~31位存放于m_data[0],32~63位存放于m_data[1],其它各位存放的m_data index依此类推。

m_data[0]的最高位,也就是第31位为UInt128的第0位,m_data[0]的第30位为UInt128的第1位,UInt128的其它各位存放的位置依此类推。

2. 当前的RoutingZone为叶子节点,但其m_bin中已经有了clientID与要加入的contact的clientID相同的contact。如果是这种情况,则将更新原contact的信息。

3. 当前的RoutingZone为叶子节点,其RoutingBin m_bin中仍然可以放下更多的contact。在这种情况下,则直接向RoutingBin中添加contact。

4. 当前的RoutingZone为叶子节点,其RoutingBin m_bin中无法放下更多的节点,但是RoutingZone可以被分割。如果是这种情况,则首先执行RoutingZone的分割,然后依照第1中情况中的规则,添加contact。

这里可以再来看一下判断RoutingZone是否可以被分割的依据:

bool CRoutingZone::CanSplit() const throw()
{
    // Max levels allowed.
    if (m_level >= 127) {
        return false;
    }

    // Check if this zone is allowed to split.
    return ((m_zoneIndex < KK || m_level < KBASE) && m_bin->GetSize() == K);
}

CRoutingZone::CanSplit()根据多个条件来判断一个RoutingZone是否可以被分割。

如果m_level大于等于127,也就是KadID的位长度,则直接返回不能再分割。

此外只有当前RoutingZone的RoutingBin中所包含的contact数达到上限时才有可能被分割。此上限也就是K,在amule-2.3.1/src/kademlia/kademlia/Defines.h中,我们可以看到,这个值是被定义为了10。

RoutingBin中所包含的contact数达到上限,并不是RoutingZone能够被分割的充分条件。在此之外,还需要满足两个条件中的一个:RoutingZone的层次不能太低,也就是说m_level不能太大,具体点说,就是小于KBASE,在amule-2.3.1/src/kademlia/kademlia/Defines.h中,我们可以看到,这个值是被定义为了4;或者m_zoneIndex不能太大,小于KK值,该值定义为5。

再来看下分割动作具体如何执行:

void CRoutingZone::Split()
{
    StopTimer();
        
    m_subZones[0] = GenSubZone(0);
    m_subZones[1] = GenSubZone(1);

    ContactList entries;
    m_bin->GetEntries(&entries);
    m_bin->m_dontDeleteContacts = true;
    delete m_bin;
    m_bin = NULL;

    for (ContactList::const_iterator it = entries.begin(); it != entries.end(); ++it) {
        if (!m_subZones[(*it)->GetDistance().GetBitNumber(m_level)]->m_bin->AddContact(*it)) {
            delete *it;
        }
    }
}

。。。。。。

CRoutingZone *CRoutingZone::GenSubZone(unsigned side)
{
    wxASSERT(side <= 1);

    CUInt128 newIndex(m_zoneIndex);
    newIndex <<= 1;
    newIndex += side;
    return new CRoutingZone(this, m_level + 1, newIndex);
}


void CRoutingZone::StopTimer()
{
    CKademlia::RemoveEvent(this);
}

在CRoutingZone::Split()中将本RoutingZone分割为两个。

1. 它首先调用StopTimer()停掉定时器。在 Linux下电骡aMule Kademlia网络构建分析3 一文中,我们有看到在定期会被执行的CKademlia::Process()函数中,会执行RoutingZone的OnSmallTimer()等函数。此处调用StopTimer()也就是停掉这些函数的定期执行。

2. 然后两次调用GenSubZone(),创建两个子RoutingZone。对于GenSubZone(),主要看一下,RoutingZone的zoneIndex和level的计算。

3. 从RoutingBin m_bin中获取所有的contacts。然后删除m_bin。

4. 将所有的contacts依据规则添加进两个子RoutingZone中。

由前面的分析我们看到,无论是什么样的case,要实际添加contact,最终总会调到CRoutingBin::AddContact(),这里我们再来看一下这个函数的实现(amule-2.3.1/src/kademlia/routing/RoutingBin.cpp):

bool CRoutingBin::AddContact(CContact *contact)
{
    wxASSERT(contact != NULL);

    uint32_t sameSubnets = 0;
    // Check if we already have a contact with this ID in the list.
    for (ContactList::const_iterator it = m_entries.begin(); it != m_entries.end(); ++it) {
        if (contact->GetClientID() == (*it)->GetClientID()) {
            return false;
        }
        if ((contact->GetIPAddress() & 0xFFFFFF00) == ((*it)->GetIPAddress() & 0xFFFFFF00)) {
            sameSubnets++;
        }
    }
    // Several checks to make sure that we don't store multiple contacts from the same IP or too many contacts from the same subnet
    // This is supposed to add a bit of protection against several attacks and raise the resource needs (IPs) for a successful contact on the attacker side
    // Such IPs are not banned from Kad, they still can index, search, etc so multiple KAD clients behind one IP still work

    if (!CheckGlobalIPLimits(contact->GetIPAddress(), contact->GetUDPPort())) {
        return false;
    }

    // no more than 2 IPs from the same /24 netmask in one bin, except if its a LANIP (if we don't accept LANIPs they already have been filtered before)
    if (sameSubnets >= 2 && !::IsLanIP(wxUINT32_SWAP_ALWAYS(contact->GetIPAddress()))) {
        AddDebugLogLineN(logKadRouting, wxT("Ignored kad contact (IP=") + KadIPPortToString(contact->GetIPAddress(), contact->GetUDPPort()) + wxT(") - too many contact with the same subnet in RoutingBin"));
        return false;
    }

    // If not full, add to the end of list
    if (m_entries.size() < K) {
        m_entries.push_back(contact);
        AdjustGlobalTracking(contact->GetIPAddress(), true);
        return true;
    }
    return false;
}

即便是走到了CRoutingBin::AddContact(),contact也并不一定会真的被加进本节点的联系人列表中。

如上所示,在CRoutingBin::AddContact()中还是会再次进行过滤。

1. CRoutingBin::AddContact()首先会遍历已经保存的所有的contact。遍历的过程中会计算这一zone中,与所传入contact处于相同子网段内的contact的数量。同时会检查要添加的contact的clientID是否与某个已经保存了的contact的clientID相同,如果是则直接返回;如前面看到的,这种case应该在CRoutingZone::Add()中处理的。

2. 执行CheckGlobalIPLimits()对IP及UDPPort做一个检查。若检查不通过,则直接返回。

3. 确保一个RoutingBin中,来自同一个/24网络掩码内的contact IP不多于2个,除非它是一个LANIP。

4. 还要再次检查RoutingBin已经保存的contacts的数量是否超出了上限K,也就是10个。没有超出时,才会真正地添加contact。并调整GlobalTracking。

此处的CheckGlobalIPLimits()和检查LanIP的方法还是值得我们再看一下。

先来看CheckGlobalIPLimits(),其执行过程如下:

bool CRoutingBin::CheckGlobalIPLimits(uint32_t ip, uint16_t DEBUG_ONLY(port))
{
    // no more than 1 KadID per IP
    uint32_t sameIPCount = 0;
    GlobalTrackingMap::const_iterator itIP = s_globalContactIPs.find(ip);
    if (itIP != s_globalContactIPs.end()) {
        sameIPCount = itIP->second;
    }
    if (sameIPCount >= MAX_CONTACTS_IP) {
        AddDebugLogLineN(logKadRouting, wxT("Ignored kad contact (IP=") + KadIPPortToString(ip, port) + wxT(") - too many contacts with the same IP (global)"));
        return false;
    }
    //  no more than 10 IPs from the same /24 netmask global, except if its a LANIP (if we don't accept LANIPs they already have been filtered before)
    uint32_t sameSubnetGlobalCount = 0;
    GlobalTrackingMap::const_iterator itSubnet = s_globalContactSubnets.find(ip & 0xFFFFFF00);
    if (itSubnet != s_globalContactSubnets.end()) {
        sameSubnetGlobalCount = itSubnet->second;
    }
    if (sameSubnetGlobalCount >= MAX_CONTACTS_SUBNET && !::IsLanIP(wxUINT32_SWAP_ALWAYS(ip))) {
        AddDebugLogLineN(logKadRouting, wxT("Ignored kad contact (IP=") + KadIPPortToString(ip, port) + wxT(") - too many contacts with the same subnet (global)"));
        return false;
    }
    return true;
}

这里主要确保两个事情:

1. 在全局所有的contact中,每个IP不多于一个KadID。也就是对于一个IP而言,只能有一个contact。

2. 在全局所有的contact中,位于相同的/24子网中的contacts不多于10个,除非要加入的节点的ip是LanIP。

这个地方的检查规则与前面CRoutingBin::AddContact()的很相似,只不过这里是在全局contacts范围内。

我们可以再来看一下,在aMule中,是如何判断一个IP地址是否是LanIP的。在amule-2.3.1/src/NetworkFunctions.cpp中:

/**
 * Used to store the ranges.
 */
struct IPRange
{
    const wxChar *addr;
    unsigned int mask;
    bool isLAN;
};


const IPRange ranges[] = {
//    Here is reserved blocks from RFC 3330 at http://www.rfc-editor.org/rfc/rfc3330.txt
//
//Address Block                      Present Use                           Reference
//----------------------------------------------------------------------------------
{ wxT("0.0.0.0"),        8, false }, // "This" Network             [RFC1700, page 4]
{ wxT("10.0.0.0"),       8, true  }, // Private-Use Networks               [RFC1918]
// Acording to RFC3330, 24.* and 14.* must be parsed as normal ips.
//{ wxT("14.0.0.0"),       8, false }, // Public-Data Networks     [RFC1700, page 181]
//{ wxT("24.0.0.0"),       8, false }, // Cable Television Networks                 --
{ wxT("39.0.0.0"),       8, false }, // Reserved but subject
                                     //    to allocation                   [RFC1797]
{ wxT("127.0.0.0"),      8, false }, // Loopback                   [RFC1700, page 5]
{ wxT("128.0.0.0"),     16, false }, // Reserved but subject
                                     //    to allocation                          --
{ wxT("169.254.0.0"),   16, false }, // Link Local                                --
{ wxT("172.16.0.0"),    12, true  }, // Private-Use Networks               [RFC1918]
{ wxT("191.255.0.0"),   16, false }, // Reserved but subject
                                     //    to allocation                          --
{ wxT("192.0.0.0"),     24, false }, // Reserved but subject          
                                     //    to allocation                          --
{ wxT("192.0.2.0"),     24, false }, // Test-Net
{ wxT("192.88.99.0"),   24, false }, // 6to4 Relay Anycast                 [RFC3068]
{ wxT("192.168.0.0"),   16, true  }, // Private-Use Networks               [RFC1918]
{ wxT("198.18.0.0"),    15, false }, // Network Interconnect
                                     //    Device Benchmark Testing        [RFC2544]
{ wxT("223.255.255.0"), 24, false }, // Reserved but subject
                                     //    to allocation                          --
{ wxT("224.0.0.0"),      4, false }, // Multicast                          [RFC3171]
{ wxT("240.0.0.0"),      4, false }  // Reserved for Future Use    [RFC1700, page 4]
};


struct filter_st {
    uint32 addr;        // Address and mask in anti-host order.
    uint32 mask;
};

const unsigned int number_of_ranges = sizeof(ranges) / sizeof(IPRange);
static filter_st filters[number_of_ranges];


// This function is used to initialize the IP filter
bool SetupFilter()
{
    for (unsigned int i = 0; i < number_of_ranges; ++i) {
        filters[i].addr = StringIPtoUint32( ranges[i].addr );
        filters[i].mask = ~wxUINT32_SWAP_ALWAYS((1 << (32 - ranges[i].mask)) - 1);
    }
    return true;
}




bool IsLanIP(uint32_t ip) throw()
{
    for (unsigned int i = 0; i < number_of_ranges; i++) {
        if (((ip ^ filters[i].addr) & filters[i].mask) == 0) {
            return ranges[i].isLAN;
        }
    }
    return false;
}

这里先是定义了一张表ranges,其中包含了RFC已有明确定义的网段,网段可能是Lan的,也可能不是。

在SetupFilter()中,会根据ranges表,构造一个filters表,主要就是将ranges表中没一个item的IP字段和mask字段做一个适当的转换,以方便查询。

判断一个ip是否是LanIP,则遍历filters表,找到该IP属于哪个网段。然后找到ranges表中的对应项,根据找到的项的属性,来确认一个IP是否是LanIP。

联系人CContact的查找


CRoutingZone提供了多种查找一个联系人的方式。可以根据contact的clientID来查找,可以根据contact的ip和端口号来查询,也可查询满足maxType和minKadVersion的随便的一个contact。

根据contact的clientID查询

根据contact的clientID查询使用CRoutingZone::GetContact(const CUInt128& id)函数,其定义如下:

CContact *CRoutingZone::GetContact(const CUInt128& id) const throw()
{
    if (IsLeaf()) {
        return m_bin->GetContact(id);
    } else {
        CUInt128 distance = CKademlia::GetPrefs()->GetKadID();
        distance ^= id;
        return m_subZones[distance.GetBitNumber(m_level)]->GetContact(id);
    }
}

如果当前RoutingZone是叶子节点,则直接在它的RoutingBin m_bin中查找相应的contact。否则的话,就计算出要查找的contact与本节点的distance,递归地在适当的子RoutingZone中查找。

然后来看CRoutingBin::GetContact(const CUInt128 &id),实现如下:

CContact *CRoutingBin::GetContact(const CUInt128 &id) const throw()
{
    for (ContactList::const_iterator it = m_entries.begin(); it != m_entries.end(); ++it) {
        if ((*it)->GetClientID() == id) {
            return *it;
        }
    }
    return NULL;
}

在这个函数中,遍历保存的所有的contact,逐个对比clientID,找出满足条件的返回给调用者。找不到时返回NULL。

根据contact的ip和端口号查询

根据contact的ip和端口号来查询使用GetContact(uint32_t ip, uint16_t port, bool tcpPort),实现如下:

CContact *CRoutingZone::GetContact(uint32_t ip, uint16_t port, bool tcpPort) const throw()
{
    if (IsLeaf()) {
        return m_bin->GetContact(ip, port, tcpPort);
    } else {
        CContact *contact = m_subZones[0]->GetContact(ip, port, tcpPort);
        return (contact != NULL) ? contact : m_subZones[1]->GetContact(ip, port, tcpPort);
    }
}

同样,如果当前RoutingZone是叶子节点,则直接在它的RoutingBin m_bin中查找相应的contact。否则,就递归地查找子zone 0,如果没找到,则递归地查找子zone 1。在RoutingBin中:

CContact *CRoutingBin::GetContact(uint32_t ip, uint16_t port, bool tcpPort) const throw()
{
    for (ContactList::const_iterator it = m_entries.begin(); it != m_entries.end(); ++it) {
        CContact *contact = *it;
        if ((contact->GetIPAddress() == ip)
            && ((!tcpPort && port == contact->GetUDPPort()) || (tcpPort && port == contact->GetTCPPort()) || port == 0)) {
            return contact;
        }
    }
    return NULL;
}

同样是遍历保存的所有的contact,查找完全满足条件者。

查询满足maxType和minKadVersion的随机contact

然后是查询满足maxType和minKadVersion的随便的一个contact,使用CRoutingZone::GetRandomContact():

CContact *CRoutingZone::GetRandomContact(uint32_t maxType, uint32_t minKadVersion) const
{
    if (IsLeaf()) {
        return m_bin->GetRandomContact(maxType, minKadVersion);
    } else {
        unsigned zone = GetRandomUint16() & 1 /* GetRandomUint16() % 2 */;
        CContact *contact = m_subZones[zone]->GetRandomContact(maxType, minKadVersion);
        return (contact != NULL) ? contact : m_subZones[1 - zone]->GetRandomContact(maxType, minKadVersion);
    }
}

同样,如果当前RoutingZone是叶子节点,则直接在它的RoutingBin m_bin中查找相应的contact。否则,随机地从两个子zone中选择一个递归地查找,如果没找到,则递归地查找另一个子zone。在RoutingBin中:

CContact *CRoutingBin::GetRandomContact(uint32_t maxType, uint32_t minKadVersion) const
{
    if (m_entries.empty()) {
        return NULL;
    }

    // Find contact
    CContact *lastFit = NULL;
    uint32_t randomStartPos = GetRandomUint16() % m_entries.size();
    uint32_t index = 0;

    for (ContactList::const_iterator it = m_entries.begin(); it != m_entries.end(); ++it) {
        if ((*it)->GetType() <= maxType && (*it)->GetVersion() >= minKadVersion) {
            if (index >= randomStartPos) {
                return *it;
            } else {
                lastFit = *it;
            }
        }
        index++;
    }

    return lastFit;
}

这段code也没有什么太特别的地方,此处不再赘述。

查找与某个contact距离最近的一组contacts

CRoutingZone还提供了GetClosestTo()函数,来查找与某个contact距离最近的一组contact。其实现逻辑如下:

void CRoutingZone::GetClosestTo(uint32_t maxType, const CUInt128& target, const CUInt128& distance, uint32_t maxRequired, ContactMap *result, bool emptyFirst, bool inUse) const
{
    // If leaf zone, do it here
    if (IsLeaf()) {
        m_bin->GetClosestTo(maxType, target, maxRequired, result, emptyFirst, inUse);
        return;
    }

    // otherwise, recurse in the closer-to-the-target subzone first
    int closer = distance.GetBitNumber(m_level);
    m_subZones[closer]->GetClosestTo(maxType, target, distance, maxRequired, result, emptyFirst, inUse);

    // if still not enough tokens found, recurse in the other subzone too
    if (result->size() < maxRequired) {
        m_subZones[1-closer]->GetClosestTo(maxType, target, distance, maxRequired, result, false, inUse);
    }
}

同样,如果当前RoutingZone是叶子节点,则直接在它的RoutingBin m_bin中查找。否则,从两个子zone中与目标client距离更近的子zone中递归地查找。如果找到的contact数量不足,则递归地查找另一个子zone。

此处还可以再看一下,在aMule的Kademlia网络中,两个节点之间的距离的语义。距离distance是通过两个Kademlia网络节点的KadID通过异或操作计算而来,KadID是一个随机的值。所以distance是在两个Kademlia节点的KadID确定之后的一个逻辑距离,而并没有实际的物理意义,比如,这个距离与实际的网络中,两个节点的物理位置毫无关系。

那根据这个distance逻辑距离,又是怎么确定远近的呢?假设本节点为S,有A、B、C三个节点,它们与S的距离分别为DA、DB、DC(算法为两个KadID的异或),将DA、DB、DC转化为数字值LA,LB、LC(算法为distance值中每个位上的值与该位的权值相乘后相加,其中位0的权值为2^127,位1的权值为2^126,位2的权值为2^125,。。。),若abs(LA-LB) < abs(LA - LC),则认为节点A与B的距离比节点A与C的距离要近。

也可以通过CUInt128的几个比较的operator的实现来看出这一点(在amule-2.3.1/src/kademlia/utils/UInt128.h中):

bool operator<  (const CUInt128& value) const throw() {return (CompareTo(value) <  0);}
    bool operator>  (const CUInt128& value) const throw() {return (CompareTo(value) >  0);}
    bool operator<= (const CUInt128& value) const throw() {return (CompareTo(value) <= 0);}
    bool operator>= (const CUInt128& value) const throw() {return (CompareTo(value) >= 0);}
    bool operator== (const CUInt128& value) const throw() {return (CompareTo(value) == 0);}
    bool operator!= (const CUInt128& value) const throw() {return (CompareTo(value) != 0);}

他们都通过CompareTo()来实现比较逻辑(在amule-2.3.1/src/kademlia/utils/UInt128.cpp中):

int CUInt128::CompareTo(const CUInt128 &other) const throw()
{
    for (int i = 0; i < 4; ++i) {
        if (m_data[i] < other.m_data[i])
            return -1;
        if (m_data[i] > other.m_data[i])
            return 1;
    }
    return 0;
}

在RoutingBin中:

void CRoutingBin::GetClosestTo(uint32_t maxType, const CUInt128 &target, uint32_t maxRequired, ContactMap *result, bool emptyFirst, bool inUse) const
{
    // Empty list if requested.
    if (emptyFirst) {
        result->clear();
    }

    // No entries, no closest.
    if (m_entries.size() == 0) {
        return;
    }

    // First put results in sort order for target so we can insert them correctly.
    // We don't care about max results at this time.
    for (ContactList::const_iterator it = m_entries.begin(); it != m_entries.end(); ++it) {
        if ((*it)->GetType() <= maxType && (*it)->IsIPVerified()) {
            CUInt128 targetDistance((*it)->GetClientID() ^ target);
            (*result)[targetDistance] = *it;
            // This list will be used for an unknown time, Inc in use so it's not deleted.
            if (inUse) {
                (*it)->IncUse();
            }
        }
    }

    // Remove any extra results by least wanted first.
    while (result->size() > maxRequired) {
        // Dec in use count.
         if (inUse) {
              (--result->end())->second->DecUse();
        }
         // Remove from results
         result->erase(--result->end());
    }
}

这里可以看一下aMule中,所谓的distance closest的语义。在这个函数中也就是找出所有的type小于maxType,同时IP被Verified的contacts。

如果找到的节点过多,还会从结果map中再移除一些出去。

那IP Verified究竟又是什么含义呢?我们搜索所有创建CContact的地方,可以知道,已经连接上的CContact的IP会是Verified的。

Done。

点赞
收藏
评论区
推荐文章
blmius blmius
3年前
MySQL:[Err] 1292 - Incorrect datetime value: ‘0000-00-00 00:00:00‘ for column ‘CREATE_TIME‘ at row 1
文章目录问题用navicat导入数据时,报错:原因这是因为当前的MySQL不支持datetime为0的情况。解决修改sql\mode:sql\mode:SQLMode定义了MySQL应支持的SQL语法、数据校验等,这样可以更容易地在不同的环境中使用MySQL。全局s
Wesley13 Wesley13
3年前
java将前端的json数组字符串转换为列表
记录下在前端通过ajax提交了一个json数组的字符串,在后端如何转换为列表。前端数据转化与请求varcontracts{id:'1',name:'yanggb合同1'},{id:'2',name:'yanggb合同2'},{id:'3',name:'yang
皕杰报表之UUID
​在我们用皕杰报表工具设计填报报表时,如何在新增行里自动增加id呢?能新增整数排序id吗?目前可以在新增行里自动增加id,但只能用uuid函数增加UUID编码,不能新增整数排序id。uuid函数说明:获取一个UUID,可以在填报表中用来创建数据ID语法:uuid()或uuid(sep)参数说明:sep布尔值,生成的uuid中是否包含分隔符'',缺省为
待兔 待兔
5个月前
手写Java HashMap源码
HashMap的使用教程HashMap的使用教程HashMap的使用教程HashMap的使用教程HashMap的使用教程22
Jacquelyn38 Jacquelyn38
3年前
2020年前端实用代码段,为你的工作保驾护航
有空的时候,自己总结了几个代码段,在开发中也经常使用,谢谢。1、使用解构获取json数据let jsonData  id: 1,status: "OK",data: 'a', 'b';let  id, status, data: number   jsonData;console.log(id, status, number )
Wesley13 Wesley13
3年前
Java日期时间API系列31
  时间戳是指格林威治时间1970年01月01日00时00分00秒起至现在的总毫秒数,是所有时间的基础,其他时间可以通过时间戳转换得到。Java中本来已经有相关获取时间戳的方法,Java8后增加新的类Instant等专用于处理时间戳问题。 1获取时间戳的方法和性能对比1.1获取时间戳方法Java8以前
Stella981 Stella981
3年前
Django中Admin中的一些参数配置
设置在列表中显示的字段,id为django模型默认的主键list_display('id','name','sex','profession','email','qq','phone','status','create_time')设置在列表可编辑字段list_editable
Wesley13 Wesley13
3年前
unity将 -u4E00 这种 编码 转汉字 方法
 unity中直接使用 JsonMapper.ToJson(对象),取到的字符串,里面汉字可能是\\u4E00类似这种其实也不用转,服务器会通过类似fastjson发序列化的方式,将json转对象,获取对象的值就是中文但是有时服务器要求将传参中字符串中类似\\u4E00这种转汉字,就需要下面 publ
Stella981 Stella981
3年前
Linux下电骡aMule Kademlia网络构建分析5 —— 资源的发布
资源发布请求消息的发送在aMule中,主要用CSharedFileListclass来管理共享给其它节点的文件。如我们前面在Linux下电骡aMuleKademlia网络构建分析3(http://my.oschina.net/wolfcs/blog/488387)一文中分析的那样,aMule在启动的时候,会
Python进阶者 Python进阶者
10个月前
Excel中这日期老是出来00:00:00,怎么用Pandas把这个去除
大家好,我是皮皮。一、前言前几天在Python白银交流群【上海新年人】问了一个Pandas数据筛选的问题。问题如下:这日期老是出来00:00:00,怎么把这个去除。二、实现过程后来【论草莓如何成为冻干莓】给了一个思路和代码如下:pd.toexcel之前把这