What is Nodezilla?
Nodezilla is a secure, distributed and fault-tolerant routing system (also known as a
Grid Network).
Its main purpose is to serve as a link for distributed services built on top of it (such as
chat, efficient multicast video streaming, file sharing, secure file storage ...). Nodezilla provides
caching features: any server may create a local replica of any data object. These local replicas
provide faster access and robustness against network partitions. They also reduce network
congestion by localizing access traffic. It is assumed that any server in the
infrastructure may crash, leak information, or become compromised; therefore, redundancy
and cryptographic techniques are used to ensure data protection.
What are Nodezilla Nodes?
Nodezilla nodes form a decentralized, self-organizing and fault-tolerant overlay
network within the Internet. Nodezilla provides efficient request routing, deterministic
object location, and load balancing in an application-independent manner. Furthermore,
Nodezilla nodes provide mechanisms that support and facilitate application-specific
object replication, caching, and fault recovery.
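The text above does not say how Nodezilla achieves deterministic object location. As a
rough illustration only, the sketch below uses a technique common in overlay networks:
node names and object keys are hashed into the same identifier space, and an object is
assigned to the node whose identifier is closest. The function names, the use of SHA-256
and the ring distance are all assumptions made for this example, not Nodezilla's actual
algorithm.

```python
import hashlib
from typing import Sequence

# Size of the identifier space (assumption: 256-bit identifiers from SHA-256).
ID_SPACE = 2 ** 256


def ident(name: str) -> int:
    """Map a node name or object key to a point in the identifier space."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big")


def ring_distance(a: int, b: int) -> int:
    """Distance between two identifiers on a circular identifier space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def responsible_node(object_key: str, node_names: Sequence[str]) -> str:
    """Return the node whose identifier is closest to the object's identifier.

    Every participant that knows the same set of nodes computes the same
    answer, which is what makes the location deterministic.
    """
    if not node_names:
        raise ValueError("at least one node is required")
    key = ident(object_key)
    # The node name is used as a tie-breaker so the result never depends on order.
    return min(node_names, key=lambda n: (ring_distance(ident(n), key), n))
```

With this scheme, a node that is not responsible for an object can leave without
changing where that object is found, which hints at how such overlays stay stable while
nodes come and go.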
What services does Nodezilla provide?
The first service built on top of Nodezilla's distributed routing is the now very
popular File Share service, which allows people to share files with other users. One
of the main advantages of the Nodezilla Grid Model over classical decentralized
P2P networks is the introduction of Persistent File Sharing, where a file remains available
for download even if the original "sharer" goes offline, making popular content
available 100% of the time at 100% of the client bandwidth (shameless plug); see below for details.
Currently implemented (completely or partially) services are:
Introducing persistent sharing
Persistent sharing is the ability to share a file and make it persist (i.e. remain available
to other users) even after the sharing node disappears from the network (or the
file disappears from the original node). The way the File Sharing Service is implemented
over Nodezilla's distributed router also allows downloads at a very high rate (theoretically):
because the parts of a requested file come from numerous nodes, no single node's
bandwidth should easily become a bottleneck. To achieve this, the Nodezilla Network of course
needs some room to store the persistent files. This room is found in
a space provided by each Nodezilla user (for example 200MB per node). This storage space
is used to store blocks of persistent files.
How does NPFS work?
In Nodezilla's Persistent File Share Service (NPFS from now on) a file is not made
persistent as soon as it is shared; the persistence process starts only once the file has
been downloaded several times (i.e. has gained some popularity). The
file is split into blocks, which are encoded using an information dispersal algorithm.
Blocks are then disseminated among the nodes; the more often they are downloaded,
the more nodes will cache them. The blocks are cached in the storage space
made available to the whole network by each user, e.g. a 20 MB storage space
can hold around 78 blocks of 128K. This cache is managed by the NPFS
and holds blocks of popular files. When the original sharer goes offline
(or the file disappears from the original node), all blocks, or at least enough of
them, should be available from other Nodezilla nodes to reconstruct the original
file. Over time, the least popular files will see their cache rate decrease until
they disappear completely, making space available for other blocks.
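To make the idea of "enough blocks to reconstruct the file" concrete, here is a minimal
sketch. It does not implement the information dispersal algorithm used by NPFS, which is
not detailed here. Instead it uses the simplest possible stand-in, a single XOR parity
block, which tolerates the loss of any one piece. The function names and the padding
scheme are assumptions made for this example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, block_size: int) -> List[bytes]:
    """Split data into fixed-size blocks and append one XOR parity block."""
    if not data or block_size <= 0:
        raise ValueError("data must be non-empty and block_size positive")
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Pad the last block with zero bytes so every block has the same size.
    blocks[-1] = blocks[-1].ljust(block_size, b"\0")
    parity = reduce(xor_bytes, blocks)
    return blocks + [parity]


def decode(pieces: List[Optional[bytes]], original_length: int) -> bytes:
    """Rebuild the original data from the pieces, at most one of which is None."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity block can only recover one lost piece")
    pieces = list(pieces)
    if missing:
        # The XOR of all pieces (data + parity) is zero, so the missing piece
        # equals the XOR of all the pieces that are still available.
        present = [p for p in pieces if p is not None]
        pieces[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity block and strip the padding.
    return b"".join(pieces[:-1])[:original_length]
```

A real dispersal scheme would generate several redundant blocks so that any sufficiently
large subset is enough, but the principle is the same: the file survives as long as
enough pieces remain reachable, wherever they are cached.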
What about freeloaders?
A freeloader is a person who downloads files from other people but does not
share files (i.e. doesn't contribute back to the network content). Freeloaders consume
bandwidth and CPU power selfishly, and that is Bad. To reduce the impact
of freeloaders, the NPFS introduces the notion of credits. A minimum amount of
credits is required to download a given file. The credits available are determined
by the disk space you give to the NPFS to store cached blocks, the running time
of your node and some other things. This way a "regular" user will have no problem
downloading whatever they want, and freeloaders will have to provide some cache room, making
them effectively share files (even if they are not their own files) through the cached
blocks.
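The exact credit formula is not given here. The sketch below only illustrates the shape
of such a rule, using the two inputs named above (cache space and running time). The
weights, the `NodeStats` type and the function names are hypothetical and chosen purely
for the example.

```python
from dataclasses import dataclass

# Hypothetical weights; the real NPFS formula and its other inputs are not documented here.
CREDITS_PER_CACHE_MB = 1.0
CREDITS_PER_UPTIME_HOUR = 0.5


@dataclass
class NodeStats:
    cache_mb: float      # disk space given to the NPFS for cached blocks
    uptime_hours: float  # how long the node has been running


def available_credits(stats: NodeStats) -> float:
    """Credits grow with the cache space offered and the node's running time."""
    return (stats.cache_mb * CREDITS_PER_CACHE_MB
            + stats.uptime_hours * CREDITS_PER_UPTIME_HOUR)


def can_download(stats: NodeStats, required_credits: float) -> bool:
    """A download is allowed only if the node holds the minimum credits required."""
    return available_credits(stats) >= required_credits
```

Under these made-up weights, a node offering 200 MB of cache and running for a day
clears a 100-credit threshold easily, while a node offering no cache space does not
until it has been running for a long time.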
Anonymity and cryptography
Cryptography is a very important part of Nodezilla's router and services. From
communication between nodes (through TLS) to object identification and signatures,
all important data is encrypted and signed using current algorithms (no weak
home-made crypto algorithms). Cryptographic certificates are used all over Nodezilla;
more details are in the Nodezilla Architecture Document. Anonymity is also important:
no user names, no identifiers, no file names. Someone spying on the network
can't tell what you're doing on the Nodezilla network. A node can't know which objects
are used from it, and a node can't know which nodes create objects on the grid.
Fault tolerance
The Nodezilla distributed routing allows for failsafe operation; in plain
English, requests should never fail because of a node failure. This directly
implies that Nodezilla is a totally decentralized system (no master servers), and
that all requests will be served at any given time (that doesn't mean, for instance,
that all objects will be available, but that
a request for an existing object will always be served whatever the network state
is). The Nodezilla Object Store is secured in a way that prevents an attacker from
hijacking an object and providing false content or answers for a given request
(effectively preventing overload or flood attacks on objects).
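How the Object Store rejects false content is covered by the architecture documentation
rather than this overview. One widely used way to get that property, shown here only as
an assumed illustration, is to derive an object's identifier from a cryptographic hash of
its content, so that any receiver can check an answer against the identifier it asked
for. The function names and the choice of SHA-256 are assumptions for this sketch.

```python
import hashlib
import hmac


def object_id(content: bytes) -> str:
    """Derive an identifier from the content itself (assumption: SHA-256, hex-encoded)."""
    return hashlib.sha256(content).hexdigest()


def is_authentic(expected_id: str, content: bytes) -> bool:
    """Check that the received content matches the identifier that was requested.

    A node that returns forged content cannot make it hash to the expected
    identifier, so the receiver can simply discard the answer.
    """
    return hmac.compare_digest(object_id(content), expected_id)
```

A receiver would call `is_authentic` on every answer before using it and retry with
another node when the check fails. Signatures made with certificates, as mentioned in
the cryptography section, would additionally tie an object to its creator's key.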
Target audience
This is the first release of Nodezilla and of the NPFS. It is targeted at power
users who are quite fluent in the P2P world and
network configuration, and who have a box running
nearly all day long. During the initial test phase beginners should not use Nodezilla,
as it may be quite complicated to understand compared to the simpler
P2P products currently available. The Nodezilla node will be available for win32 platforms
and Linux. The Nodezilla client is written in Java and should run on all platforms
supporting J2RE 1.4 or later. The client connects through the network to a running
node, making the remote operation of a node possible.
UPDATE: As of 22 November 2004, NZ is considered stable and usable by anybody.