Added the first library snippet

Andrew Alderwick authored on 30/08/2021 22:20:34
new file mode 100644
@@ -0,0 +1,180 @@
+(
+
+# Summary
+
+Reads a file in chunks - perfect for when you have a small buffer or when you
+don't know the file size. Copes with files up to 4,294,967,295 bytes long.
+
+# Code
+
+)
+@file-read-chunks ( func* udata* buf* size* filename* -- func* udata'* buf* size* filename* )
+
+	#0000 DUP2                 ( F* U* B* SZ* FN* OL* OH* / )
+	&resume
+	ROT2 STH2                  ( F* U* B* SZ* OL* OH*     / FN* )
+	ROT2                       ( F* U* B* OL* OH* SZ*     / FN* )
+
+	&loop
+	STH2kr .File/name DEO2     ( F* U* B* OL* OH* SZ*     / FN* )
+	STH2k .File/length DEO2    ( F* U* B* OL* OH*         / FN* SZ* )
+	STH2k .File/offset-hs DEO2 ( F* U* B* OL*             / FN* SZ* OH* )
+	STH2k .File/offset-ls DEO2 ( F* U* B*                 / FN* SZ* OH* OL* )
+	SWP2                       ( F* B* U*                 / FN* SZ* OH* OL* )
+	ROT2k NIP2                 ( F* B* U* B* F*           / FN* SZ* OH* OL* )
+	OVR2 .File/load DEO2       ( F* B* U* B* F*           / FN* SZ* OH* OL* )
+	.File/success DEI2 SWP2    ( F* B* U* B* length* F*   / FN* SZ* OH* OL* )
+	JSR2                       ( F* B* U'* done-up-to*    / FN* SZ* OH* OL* )
+	ROT2 SWP2                  ( F* U'* B* done-up-to*    / FN* SZ* OH* OL* )
+	SUB2k NIP2                 ( F* U'* B* -done-length*  / FN* SZ* OH* OL* )
+	ORAk ,&not-end JCN         ( F* U'* B* -done-length*  / FN* SZ* OH* OL* )
+
+	POP2 POP2r POP2r           ( F* U'* B*                / FN* SZ* )
+	STH2r STH2r                ( F* U'* B* SZ* FN*        / )
+	JMP2r
+
+	&not-end
+	STH2r SWP2                 ( F* U'* B* OL* -done-length* / FN* SZ* OH* )
+	LTH2k JMP INC2r            ( F* U'* B* OL* -done-length* / FN* SZ* OH'* )
+	SUB2                       ( F* U'* B* OL'*              / FN* SZ* OH'* )
+	STH2r STH2r                ( F* U'* B* OL'* OH'* SZ*     / FN* )
+	,&loop JMP
+
+(
+
+# Arguments
+
+* func*     - address of callback routine
+* udata*    - userdata to pass to callback routine
+* buf*      - address of first byte of the buffer for the file's contents
+* size*     - size in bytes of buffer
+* filename* - address of filename string (zero-terminated)
+
+All of the arguments are shorts (suffixed by asterisks *).
+
+# Callback routine
+
+If you make use of userdata, the signature of the callback routine is:
+)
+	( udata* buf* length* -- udata'* done-up-to* )
+(
+
+* udata* and buf* are as above.
+* length* is the length of the chunk being worked on, which could be less than
+  size* when near the end of the file. func* is called with a zero length* to
+  signify end of file.
+* udata'* is the (potentially) modified userdata, to be passed on to the next
+  callback routine call and returned by file-read-chunks after the last chunk.
+* done-up-to* is the pointer to the first unprocessed byte in the buffer, or
+  buf* + length* if the whole chunk was processed.
+
+If you don't make use of any userdata, feel free to pretend the signature is:
+)
+	( buf* length* -- done-up-to* )
+(
+
+# Userdata
+
+file-read-chunks never inspects the udata* parameter: it only carries the
+value returned by one callback call into the next. The meaning of its contents
+is up to you - it could simply be a short integer or a pointer to a region of
+memory.
+
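+As an illustration, here is a sketch of a hypothetical callback, count-bytes,
+that treats udata* as a running total of the bytes seen so far. It claims the
+whole chunk each time, so it returns buf* + length* along with the new total.
+The total is a single short, so it wraps around after 65,535 bytes:
+
+	@count-bytes
+		STH2k ADD2
+		SWP2 STH2r ADD2
+		SWP2 JMP2r
+
+Start it off with #0000 as udata* and read the total from udata'* once
+file-read-chunks returns.
+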
+# Operation
+
+file-read-chunks reads a file into the buffer you provide and calls func* with
+JSR2 for each chunk of data, finishing with an empty chunk at end of file.
+
+file-read-chunks loops until done-up-to* equals buf*, which means func*
+processed no data. This could be because processing cannot continue without a
+larger buffer, because an error is detected in the data and further processing
+is pointless, or because the end-of-file empty chunk leaves the callback
+routine with no other choice.
+
+# Return values
+
+file-read-chunks keeps its input parameters on the stacks throughout its
+operation, so rather than discarding them it returns them all, with udata*
+replaced by the final udata'*, in case they are useful to the caller. POP2 any
+that you don't need.
+
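+A minimal sketch of a complete call, assuming the chunk-at-a-time example
+callback described below, no userdata, and two hypothetical labels of your own:
+buffer, a 256-byte scratch area, and filename, a zero-terminated string. Both
+labels need to sit somewhere outside the flow of execution. The five shorts
+left on the stack afterwards are dropped because this caller has no use for
+them:
+
+	;chunk-at-a-time #0000 ;buffer #0100 ;filename
+	;file-read-chunks JSR2
+	POP2 POP2 POP2 POP2 POP2
+
+	@buffer $100
+	@filename "example.txt $1
+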
+# Discussion about done-up-to*
+
+file-read-chunks is extra flexible because it doesn't just give you one chance
+to process each part of the file. Consider a func* routine that splits the
+chunk's contents into words separated by whitespace. If the buffer ends with a
+letter, you can't assume that letter is the end of that word - it's more likely
+to be in the middle of a word that continues on. If func* returns the address
+of the first letter of the word so far, it will be called again with that
+first letter as the first character of the next chunk's buffer. There's no
+need to remember the earlier part of the word because you get presented with
+the whole lot again to give parsing another try.
+
+That said, func* must make at least _some_ progress through the chunk: if it
+returns the address at the beginning of the buffer, buf*, file-read-chunks will
+terminate and return to its caller. With our word example, a buffer of ten
+bytes will be unable to make progress with words that are ten or more letters
+long. Depending on your application, either make the buffer big enough so that
+progress should always be possible, or find a way to discern this error
+condition from everything working fine.
+
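+A sketch of a callback in this style, using lines instead of words to keep it
+short: the hypothetical up-to-last-newline below scans backwards from the end
+of the chunk for a newline byte, #0a, and returns the address just after it, so
+any unfinished line is presented again at the start of the next chunk. It does
+nothing with the complete lines, which is where real processing would go. A
+chunk with no newline at all, including a final line that lacks one, makes it
+return buf* and so stops file-read-chunks as described above:
+
+	@up-to-last-newline
+		OVR2 ADD2
+		&loop
+		EQU2k ,&done JCN
+		#0001 SUB2 LDAk #0a NEQ ,&loop JCN
+		INC2
+		&done
+		NIP2 JMP2r
+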
+# Discussion about recursion
+
+Since all of file-read-chunks's data is on the working and return stacks, it
+can be called recursively by code running in the callback routine. For example,
+a code assembler can process the phrase "include library.tal" by calling
+file-read-chunks again with library.tal as the filename. There are a couple of
+caveats:
+
+* the filename string must not reside inside file-read-chunks's working buffer,
+  otherwise it gets overwritten by the file's contents and subsequent chunks
+  will fail to be read properly; and
+
+* if the buffer is shared with the parent file-read-chunks, the callback routine
+  should stop further processing and return with done-up-to* straight away,
+  since the buffer contents have already been replaced by the child
+  file-read-chunks.
+
+# Resuming / starting operation from an arbitrary offset
+
+You can call file-read-chunks/resume instead of the main routine if you'd like
+to provide your own offset shorts rather than beginning at the start of the
+file. The effective signature for file-read-chunks/resume is:
+)
+	( func* udata* buf* size* filename* offset-ls* offset-hs* -- func* udata'* buf* size* filename* )
+(
+
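+As a sketch, the call below skips the first 73,728 bytes of the file by
+starting at offset #00012000: the low short #2000 is pushed first, then the
+high short #0001. It assumes the chunk-at-a-time example callback, no
+userdata, and two hypothetical labels of your own: buffer, a 256-byte scratch
+area, and filename, a zero-terminated string:
+
+	;chunk-at-a-time #0000 ;buffer #0100 ;filename
+	#2000 #0001
+	;file-read-chunks/resume JSR2
+	POP2 POP2 POP2 POP2 POP2
+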
+# Example callback routines
+
+This minimal routine is a no-op that "processes" the entire buffer each time
+and returns a valid done-up-to*:
+
+	@quick-but-useless
+		ADD2 JMP2r
+
+This extremely inefficient callback routine simply prints a single character
+from the buffer and asks for the next one. It operates with a buffer that is
+just one byte long, but for extra inefficiency you can assign a much larger
+buffer and it will ignore everything after the first byte each time. If the
+chunk is zero length it returns done-up-to* == buf* so that file-read-chunks
+returns properly.
+
+	@one-at-a-time
+		#0000 NEQ2 JMP JMP2r
+		LDAk .Console/write DEO
+		INC2 JMP2r
+
+This more efficient example writes the entire chunk to the console before
+requesting the next one by returning. How short can you make a routine that
+does the same?
+
+	@chunk-at-a-time
+		&loop
+		ORAk ,&not-eof JCN
+		POP2 JMP2r
+
+		&not-eof
+		STH2
+		LDAk .Console/write DEO
+		INC2 STH2r #0001 SUB2
+		,&loop JMP
+
+)