<!-- HTML header for doxygen 1.8.10-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.9.4"/>
<title>librsync: Streaming API</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 56px;">
<td id="projectalign" style="padding-left: 0.5em;">
<div id="projectname">librsync
 <span id="projectnumber">2.3.4</span>
</div>
</td>
</tr>
</tbody>
</table>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.9.4 -->
<script type="text/javascript" src="menudata.js"></script>
<script type="text/javascript" src="menu.js"></script>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:d3d9a9a6595521f9666a5e94cc830dab83b65699&dn=expat.txt MIT */
$(function() {
initMenu('',false,false,'search.php','Search');
});
/* @license-end */
</script>
<div id="main-nav"></div>
</div><!-- top -->
<div><div class="header">
<div class="headertitle"><div class="title">Streaming API </div></div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p ><a class="anchor" id="md_doc_streaming"></a> A key design requirement for librsync is that it should handle data as and when the hosting application requires it. librsync can be used inside applications that do non-blocking IO or filtering of network streams, because it never does IO directly and never needs to block waiting for data.</p>
<p >Arbitrary-length input and output buffers are passed to the library by the application, through an instance of <a class="el" href="librsync_8h.html#abf9f543dbfe5c1e60c8ed1c93d087767">rs_buffers_t</a>. The library proceeds as far as it can, and returns an <a class="el" href="librsync_8h.html#adec85b529224f0240ae1afccff827462" title="Return codes from nonblocking rsync operations.">rs_result</a> value indicating whether it needs more data or space.</p>
<p >All the state needed by the library to resume processing when more data is available is kept in a small opaque <a class="el" href="librsync_8h.html#add6622b38e3fa557301a190876d5ed4a" title="Job of work to be done.">rs_job_t</a> structure. After creation of a job, repeated calls to <a class="el" href="librsync_8h.html#aede8e0f42424b9aa29093f94b59ea029" title="Run a rs_job state machine until it blocks (RS_BLOCKED), returns an error, or completes (RS_DONE).">rs_job_iter()</a> in between filling and emptying the buffers keep data flowing through the stream. The <a class="el" href="librsync_8h.html#adec85b529224f0240ae1afccff827462" title="Return codes from nonblocking rsync operations.">rs_result</a> values returned may indicate:</p>
<ul>
<li><a class="el" href="librsync_8h.html#a7feb858ceba3b8f3cf048d49be108253a739063053a289b5c3393d78cc77b41b2" title="Completed successfully.">RS_DONE</a>: processing is complete</li>
<li><a class="el" href="librsync_8h.html#a7feb858ceba3b8f3cf048d49be108253af2d289bbc65678b4b00f56a2e6632957" title="Blocked waiting for more data.">RS_BLOCKED</a>: processing has blocked pending more data</li>
<li>one of various possible errors in processing (see <a class="el" href="librsync_8h.html#adec85b529224f0240ae1afccff827462" title="Return codes from nonblocking rsync operations.">rs_result</a>).</li>
</ul>
<p >These can be converted to a human-readable string by <a class="el" href="librsync_8h.html#a53c3ba4c320a497218a86936e8ee802c" title="Return an English description of a rs_result value.">rs_strerror()</a>.</p>
<dl class="section note"><dt>Note</dt><dd>Smaller buffers have high relative handling costs. Application performance will be improved by using buffers of at least 32 KB or so on each call.</dd></dl>
<dl class="section see"><dt>See also</dt><dd><a class="el" href="api_whole.html">Whole-file API</a> - Simpler but more limited interface than the streaming interface.</dd>
<dd>
<a class="el" href="api_pull.html">Pull API</a> - Intermediate-complexity callback interface.</dd>
<dd>
<a class="el" href="api_callbacks.html">IO callbacks</a> - for reading from the basis file when doing a "patch" operation.</dd></dl>
<h1><a class="anchor" id="autotoc_md80"></a>
Creating Jobs</h1>
<p >All streaming librsync jobs are initiated using a <code>_begin</code> function to create an <a class="el" href="librsync_8h.html#add6622b38e3fa557301a190876d5ed4a" title="Job of work to be done.">rs_job_t</a> object, passing in any necessary initialization parameters. The various jobs available are:</p>
<ul>
<li><a class="el" href="librsync_8h.html#a226f57577fcad2e4c97f3e4d612650d3" title="Start generating a signature.">rs_sig_begin()</a>: Calculate the signature of a file.</li>
<li><a class="el" href="librsync_8h.html#a5898e9ed00b5cdfe5fa58bfa0f25821c" title="Read a signature from a file into an rs_signature structure in memory.">rs_loadsig_begin()</a>: Load a signature into memory.</li>
<li><a class="el" href="delta_8c.html#a59fe8536218632417097d084606bb675" title="Prepare to compute a streaming delta.">rs_delta_begin()</a>: Calculate the delta between a signature and a new file.</li>
<li><a class="el" href="librsync_8h.html#a15efa6e180d239fee2c195874618b0ea" title="Apply a delta to a basis file to recreate the new file.">rs_patch_begin()</a>: Apply a delta to a basis to recreate the new file.</li>
</ul>
<p >Additionally, the following helper function can be used to get the recommended signature arguments from the input file's size.</p>
<ul>
<li><a class="el" href="librsync_8h.html#a3c4c3fcea2814610109536f3ae029c5e" title="Get or check signature arguments for a given file size.">rs_sig_args()</a>: Get the recommended signature arguments from the file size.</li>
</ul>
<p >After a signature has been loaded, before it can be used to calculate a delta, the hashtable needs to be initialized by calling</p>
<ul>
<li><a class="el" href="librsync_8h.html#a1c11ff785ebcd7210f5485d0c97fc812" title="Call this after loading a signature to index it.">rs_build_hash_table()</a>: Initialize the signature hashtable.</li>
</ul>
<p >The patch job accepts the patch as input, and uses a callback to look up blocks within the basis file.</p>
<p >You must configure read, write and basis callbacks after creating the job but before it is run.</p>
<h1><a class="anchor" id="autotoc_md81"></a>
Running Jobs</h1>
<p >The work of the operation is done when the application calls <a class="el" href="librsync_8h.html#aede8e0f42424b9aa29093f94b59ea029" title="Run a rs_job state machine until it blocks (RS_BLOCKED), returns an error, or completes (RS_DONE).">rs_job_iter()</a>. This includes reading from input files via the callback, running the rsync algorithms, and writing output.</p>
<p >The IO callbacks are only called from inside <a class="el" href="librsync_8h.html#aede8e0f42424b9aa29093f94b59ea029" title="Run a rs_job state machine until it blocks (RS_BLOCKED), returns an error, or completes (RS_DONE).">rs_job_iter()</a>. If any of them return an error, <a class="el" href="librsync_8h.html#aede8e0f42424b9aa29093f94b59ea029" title="Run a rs_job state machine until it blocks (RS_BLOCKED), returns an error, or completes (RS_DONE).">rs_job_iter()</a> will generally return the same error.</p>
<p >When librsync needs to do input or output, it calls one of the callback functions. <a class="el" href="librsync_8h.html#aede8e0f42424b9aa29093f94b59ea029" title="Run a rs_job state machine until it blocks (RS_BLOCKED), returns an error, or completes (RS_DONE).">rs_job_iter()</a> returns when the operation has completed or failed, or when one of the IO callbacks has blocked.</p>
<p ><a class="el" href="librsync_8h.html#aede8e0f42424b9aa29093f94b59ea029" title="Run a rs_job state machine until it blocks (RS_BLOCKED), returns an error, or completes (RS_DONE).">rs_job_iter()</a> will usually be called in a loop, perhaps alternating librsync processing with other application functions.</p>
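<p >The calling pattern described above can be sketched without librsync itself. The following is a minimal toy model, not the librsync API: <code>toy_buffers</code>, <code>toy_iter()</code> and <code>toy_drive()</code> are hypothetical names that only mirror the idea of caller-owned input and output windows, and the "job" simply copies bytes. It shows the application refilling input, offering an empty output window, calling the iterator, and draining whatever was produced, until the job reports completion.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins, NOT the librsync types. They only mirror the idea of
 * caller-owned buffers and a result code that says "done" or "blocked". */
typedef enum { TOY_DONE, TOY_BLOCKED } toy_result;

typedef struct {
    const char *next_in;  /* next unread input byte */
    size_t avail_in;      /* unread bytes at next_in */
    int eof_in;           /* nonzero once the caller has no more input */
    char *next_out;       /* where the next output byte goes */
    size_t avail_out;     /* free space at next_out */
} toy_buffers;

/* Toy "job": copy input to output as far as both windows allow, then
 * report whether it finished or needs more input or output space. */
toy_result toy_iter(toy_buffers *b)
{
    size_t n = b->avail_in < b->avail_out ? b->avail_in : b->avail_out;

    if (n > 0)
        memcpy(b->next_out, b->next_in, n);
    b->next_in += n;
    b->avail_in -= n;
    b->next_out += n;
    b->avail_out -= n;
    return (b->eof_in && b->avail_in == 0) ? TOY_DONE : TOY_BLOCKED;
}

/* Driver loop: feed src in pieces of in_chunk bytes through an output
 * window of out_chunk bytes, appending drained output to dst.
 * Returns the number of bytes written to dst. */
size_t toy_drive(const char *src, size_t len, char *dst,
                 size_t in_chunk, size_t out_chunk)
{
    char out[64];
    toy_buffers b = {0};
    size_t fed = 0, written = 0, produced;
    toy_result r;

    assert(in_chunk > 0 && out_chunk > 0 && out_chunk <= sizeof out);
    do {
        if (b.avail_in == 0 && fed < len) {      /* refill the input window */
            size_t n = len - fed < in_chunk ? len - fed : in_chunk;
            b.next_in = src + fed;
            b.avail_in = n;
            fed += n;
        }
        b.eof_in = (fed == len);                 /* no more input after this */
        b.next_out = out;                        /* offer an empty output window */
        b.avail_out = out_chunk;

        r = toy_iter(&b);                        /* let the job make progress */

        produced = out_chunk - b.avail_out;      /* drain what was produced */
        memcpy(dst + written, out, produced);
        written += produced;
    } while (r != TOY_DONE);
    return written;
}
```

<p >For example, <code>toy_drive("hello world", 11, dst, 5, 3)</code> moves the 11 bytes through a 3-byte output window over several iterations. A real application could interleave other work between iterations, and would use far larger buffers, as the note on buffer sizes suggests.</p>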
<h1><a class="anchor" id="autotoc_md82"></a>
Deleting Jobs</h1>
<p >A job is deleted and its memory freed up using <a class="el" href="librsync_8h.html#a297b5522990715aff982f8556052c273" title="Deallocate job state.">rs_job_free()</a>.</p>
<p >This is typically called when the job has completed or failed. It can be called earlier if the application decides it wants to cancel processing.</p>
<p ><a class="el" href="librsync_8h.html#a297b5522990715aff982f8556052c273" title="Deallocate job state.">rs_job_free()</a> does not delete the output of the job, such as the sumset loaded into memory. It does delete the job's statistics.</p>
<h1><a class="anchor" id="autotoc_md83"></a>
State Machine Internals</h1>
<p >Internally, the operations are implemented as state machines that move through various states as input and output buffers become available.</p>
<p >All computers and programs are state machines. So why is the representation as a state machine a little more explicit (and perhaps verbose) in librsync than in other places? Because we need to be able to let the real computer go off and do something else, like waiting for network traffic, while still remembering where it was in the librsync state machine.</p>
<p >librsync will never block waiting for IO, unless the callbacks do that.</p>
<p >The current state is represented by the private field <a class="el" href="structrs__job.html#aa789990d61f5eadc6a26aeedf4bec765" title="Callback for each processing step.">rs_job_t::statefn</a>, which points to a function with a name like <code>rs_OPERATION_s_STATE</code>. Every time librsync tries to make progress, it will call this function.</p>
<p >The state function returns one of the <a class="el" href="librsync_8h.html#adec85b529224f0240ae1afccff827462" title="Return codes from nonblocking rsync operations.">rs_result</a> values. The most important values are</p>
<ul>
<li><a class="el" href="librsync_8h.html#a7feb858ceba3b8f3cf048d49be108253a739063053a289b5c3393d78cc77b41b2" title="Completed successfully.">RS_DONE</a>: Completed successfully.</li>
<li><a class="el" href="librsync_8h.html#a7feb858ceba3b8f3cf048d49be108253af2d289bbc65678b4b00f56a2e6632957" title="Blocked waiting for more data.">RS_BLOCKED</a>: Cannot make further progress at this point.</li>
<li><a class="el" href="librsync_8h.html#a7feb858ceba3b8f3cf048d49be108253a5366e53561c2ec9a897470535ea0139c" title="The job is still running, and not yet finished or blocked.">RS_RUNNING</a>: The state function has neither completed nor blocked but wants to be called again. <b>XXX</b>: Perhaps this should be removed?</li>
</ul>
<p >States need to correspond to suspension points. The only place the job can resume after blocking is at the entry to a state function.</p>
<p >Therefore states must be "all or nothing" in that they can either complete, or restart without losing information.</p>
<p >Basically every state needs to work from one input buffer to one output buffer.</p>
<p >States should generally never return <a class="el" href="librsync_8h.html#a7feb858ceba3b8f3cf048d49be108253a739063053a289b5c3393d78cc77b41b2" title="Completed successfully.">RS_DONE</a> directly. Instead, they should call rs__job_done(), which sets the state function to rs__s_done(). This makes sure that any pending output is flushed out before <a class="el" href="librsync_8h.html#a7feb858ceba3b8f3cf048d49be108253a739063053a289b5c3393d78cc77b41b2" title="Completed successfully.">RS_DONE</a> is returned to the application.</p>
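<p >The ideas in this section can be illustrated with a small self-contained model. This is a sketch under stated assumptions, not librsync's implementation: every <code>toy_</code> name is hypothetical, and the "job" just emits a four-byte header followed by one echoed input byte. It shows a state function pointer as the only resume point, a state that blocks before consuming anything so that it can be re-entered safely, and a final state that refuses to report completion while output is still pending.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; none of these names exist in librsync. */
typedef enum { TOY_DONE, TOY_BLOCKED, TOY_RUNNING } toy_result;

typedef struct toy_job toy_job;
typedef toy_result (*toy_statefn)(toy_job *);

struct toy_job {
    toy_statefn statefn;      /* current state: the only resume point */
    const char *in;           /* caller-supplied input window */
    size_t avail_in;
    char *out;                /* caller-supplied output window */
    size_t avail_out;
    char pending[8];          /* output produced but not yet delivered */
    size_t pending_len, pending_pos;
};

/* Queue output inside the job so a state never depends on output space. */
void toy_queue(toy_job *job, const char *s, size_t n)
{
    assert(job->pending_len + n <= sizeof job->pending);
    memcpy(job->pending + job->pending_len, s, n);
    job->pending_len += n;
}

/* Deliver queued output; returns 1 once nothing is pending. */
int toy_flush(toy_job *job)
{
    while (job->pending_pos < job->pending_len && job->avail_out > 0) {
        *job->out++ = job->pending[job->pending_pos++];
        job->avail_out--;
    }
    return job->pending_pos == job->pending_len;
}

/* Final state: completion is reported only when all output is delivered. */
toy_result toy_s_done(toy_job *job)
{
    return toy_flush(job) ? TOY_DONE : TOY_BLOCKED;
}

/* States call this instead of returning TOY_DONE themselves. */
toy_result toy_job_done(toy_job *job)
{
    job->statefn = toy_s_done;
    return TOY_RUNNING;
}

/* "All or nothing": block before consuming anything, or finish the step. */
toy_result toy_s_body(toy_job *job)
{
    if (job->avail_in == 0)
        return TOY_BLOCKED;   /* nothing consumed, safe to re-enter later */
    toy_queue(job, job->in, 1);
    job->in++;
    job->avail_in--;
    return toy_job_done(job);
}

toy_result toy_s_header(toy_job *job)
{
    toy_queue(job, "TOY1", 4);
    job->statefn = toy_s_body;
    return TOY_RUNNING;       /* wants to be called again in the new state */
}

void toy_begin(toy_job *job)
{
    memset(job, 0, sizeof *job);
    job->statefn = toy_s_header;
}

/* Run states until one blocks or completes; pending output goes first. */
toy_result toy_iter(toy_job *job)
{
    toy_result r;

    do {
        if (!toy_flush(job))
            return TOY_BLOCKED;   /* output window full: caller must drain */
        r = job->statefn(job);
    } while (r == TOY_RUNNING);
    return r;
}
```

<p >If <code>toy_s_body()</code> returned <code>TOY_DONE</code> itself, the echoed byte could still be sitting in the pending queue when the caller stops iterating. Routing completion through <code>toy_job_done()</code> avoids that, which is the same reason given above for not returning <code>RS_DONE</code> directly.</p>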
</div></div><!-- contents -->
</div><!-- PageDoc -->
<!-- HTML footer for doxygen 1.8.10-->
<!-- start footer part -->
<hr class="footer"/><address class="footer"><small>
Generated on Sun Feb 19 2023 16:26:51 for librsync by  <a href="http://www.doxygen.org/index.html">
<img class="footer" src="doxygen.png" alt="doxygen"/>
</a> 1.9.4
</small></address>
</body>
</html>