Public comments are a very important part of the OGF document approval process. Through public comments, documents are given scrutiny by people with a wide range of expertise and interests. Ideally, an OGF document will be self-contained, relying only on the other documents and standards it cites to be clear and useful. Public comments of any type are welcomed, from small editorial comments to broader comments about the scope or merit of the proposed document. The simple act of reading a document and providing a public comment that you read it and found it suitable for publication is very useful, and provides valuable feedback to the document authors.
Thank you for making public comments on this document!
Comments for Document: JSDL Parameter Sweep Job Extension
Author(s): M. Drescher, A. Anjomshoaa, G. Williams, D. Meredith
Public Comment End: 31 Dec, 2008
I'm reading the current draft document and have some doubts concerning the implementability of matching using some types of XPath expressions. For example, consider:
substring(/*//jsdl-posix:Argument, 11, 2)
The potential problem I see is that the XPath evaluator (e.g. some external library) would return an atom -- a simple string in this case. How do you imagine using this output to perform the actual replacement in the document?
I don't know if there's any XPath evaluator that would return both the actual value (the evaluated substring, in this example) and the node, and only with both could the implementer of a Parameter Sweep engine perform the replacement. Please correct me if I'm wrong.
Many thanks for your comment. I can certainly understand your concern regarding how to actually implement the spec, as I am also involved with an implementation. However, the remit of the specification is to define what replacements are required, not how to perform those replacements (which, as you say, is done by a PS engine, whose job it is to actually do the select and modify substitutions).
In your example, in order to perform the substring function substitution, I imagine that the PS engine would indeed need to select the whole (parent) node (e.g. dom node) so that a substring substitution can be performed on the node’s value. To do this, the PS engine would have to parse/analyse the xpath beforehand so that the mutable parent node can be selected and subsequently updated.
FYI, have a look at: http://www.w3.org/2007/01/applets/xpathApplet.html
If you paste in the xpath expression: substring(/*//jsdl-posix:Argument, 11, 2), the parser will analyse the xpath and produce a parse tree (output shown below; note that the substring function is shown in the tree). The code to do this is also available, and could be used to parse and analyse the xpath tree prior to performing document substitutions. I hope this helps.
| FunctionQName substring
| Slash /
| Wildcard *
| SlashSlash //
| QName jsdl-posix:Argument
| IntegerLiteral 2
| IntegerLiteral 11
| IntegerLiteral 2
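The select-then-modify approach described in this reply could be sketched roughly as follows. This is only an illustration in Python, not something taken from the spec: the helper `replace_substring` and the pattern `SUBSTRING_CALL` are hypothetical names, a regular expression stands in for a real XPath parser, and only expressions of the exact form substring(/*//name, start, length) are handled. The sketch also assumes that only the first matching node is updated.

```python
import re
import xml.etree.ElementTree as ET

# JSDL 1.0 POSIX namespace, bound to the prefix used in the expression.
NAMESPACES = {"jsdl-posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"}

# Hypothetical, deliberately narrow pattern (an assumption of this sketch):
# it only recognises substring(/*//prefix:Name, start, length).
SUBSTRING_CALL = re.compile(
    r"^\s*substring\(\s*/\*//(?P<name>[^\s,]+)\s*,"
    r"\s*(?P<start>\d+)\s*,\s*(?P<length>\d+)\s*\)\s*$"
)


def replace_substring(root, expression, new_text):
    """Replace the selected substring of the first matching node's text."""
    match = SUBSTRING_CALL.match(expression)
    if match is None:
        raise ValueError("unsupported expression: " + expression)
    start = int(match.group("start"))
    length = int(match.group("length"))
    if start < 1:
        raise ValueError("this sketch only handles start positions >= 1")

    # "/*//Name" selects descendants of the document element, which is
    # ".//Name" when evaluated relative to that element.
    node = root.find(".//" + match.group("name"), NAMESPACES)
    if node is None:
        raise LookupError("no node matches " + match.group("name"))

    text = node.text or ""
    begin = start - 1  # XPath string positions are 1-based
    node.text = text[:begin] + new_text + text[begin + length:]
    return node
```

The point of the sketch is that the engine keeps hold of the node (the mutable parent) and applies the start and length arguments itself, instead of relying on the string an XPath evaluator would return.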
LoopInteger almost works, but:-
1) The zero value should not be allowed for the step attribute.
2) The text explaining the termination condition for the loop needs work. As it stands, we read "ending with a final value smaller or equal to the end value" (4.2) and "If 'nextValue' exceeds the value of eht [sic] end value", which will misbehave if e.g. step is negative, or the end value is smaller than the start value.
3) It is surely desirable for an implementation to be able to determine the trip count of the loop by inspection. If an implementation has to execute the loop to determine the cardinality of the set of parameters it will generate, it becomes vulnerable to accidentally or maliciously pathological loop bounds. One way to fix this is to take a clue from Fortran, which defines that a loop
DO I = istart, iend, istep
will execute exactly MAX((iend-istart+istep)/istep,0) times. If you make this clear in the spec, it will help to remove any residual ambiguities in termination conditions and make it much easier for implementors to write defensive code.
4) Supporting exceptions makes (3) harder to achieve, because it's a little awkward to determine whether a listed exception will match one of the parameters that the loop will generate. Do you really need to support exceptions?
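As an illustration of point (3), the Fortran trip-count rule can be written down directly. This is a minimal sketch in Python; `trip_count` and `loop_integer` are hypothetical names, not taken from the spec, and exceptions are deliberately left out.

```python
def trip_count(start, end, step):
    """Number of iterations, following the Fortran DO-loop rule."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Python's // floors where Fortran's integer division truncates.
    # The two differ only for negative quotients, which max() clamps to 0.
    return max((end - start + step) // step, 0)


def loop_integer(start, end, step):
    """Generate the parameter values from a precomputed trip count."""
    return [start + i * step for i in range(trip_count(start, end, step))]
```

Because the count is known before any value is generated, an implementation can reject an unreasonably large sweep up front, and negative steps or an end value below the start value fall out of the same formula.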
LoopDouble has all the same problems as LoopInteger, plus some additional ones of its own. There are very good reasons why Fortran90 changed the DO construct to make the use of floating point start, end and step attributes obsolescent. See e.g. appendix C2 of "Fortran 90/95 Explained" by Metcalf and Reid. Here, you have the additional complication that the double precision numbers that you represent in the XML are decimal, and will have to be converted into a binary representation by the implementation before executing the loop, and back to decimal afterwards. In finite precision, an arbitrary real number that can be expressed exactly as a floating point decimal will NOT in general have an exact representation in binary, so it will need to be rounded to the nearest available binary number (and vice versa). So you can't tell by inspection whether a given LoopDouble will be iterated n-1 or n times, and worse, you can't even guarantee that the same implementation will produce the same trip count on different hardware.
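The trip-count problem is easy to reproduce. The following Python sketch assumes a naive implementation that repeatedly adds the step and compares against the end value (`naive_loop_double` is a hypothetical name, and only a positive step is handled). With IEEE double precision arithmetic, a sweep from 0.0 to 0.3 in steps of 0.1 yields three values rather than the four a reader of the XML would expect, because the accumulated third step is slightly larger than 0.3.

```python
def naive_loop_double(start, end, step):
    """Naive sweep: accumulate the step until the end value is exceeded."""
    values = []
    value = start
    while value <= end:
        values.append(value)
        value += step  # rounding error accumulates here
    return values


print(naive_loop_double(0.0, 0.3, 0.1))  # the value near 0.3 is missing
```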
From the above, it should be clear that supporting exceptions for LoopDouble will also be problematic. You could consider restricting the allowable exceptions to special numbers that do have exact binary and decimal representations (e.g. 0.0, 1.0), but I don't think that this would solve the problem in all cases. You could try to fix this by specifying (or allowing the user to specify) some relative tolerance on the exceptions, but it gets unwieldy very quickly.
Now, if instead you were to define LoopDouble in terms of a start attribute, a step attribute, and a count attribute (i.e. how many parameters to generate), all these problems would go away, except for the problem that you can't really support exceptions without facing up to the fact that tests for equality have to be approximate.
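A count-based definition along these lines might look like the following Python sketch. The function name `loop_double_by_count` is hypothetical, and exceptions are not handled. Each value is computed directly from its index, so the number of generated parameters is exactly `count` regardless of rounding.

```python
def loop_double_by_count(start, step, count):
    """Generate exactly `count` values: start, start + step, ..."""
    if count < 0:
        raise ValueError("count must not be negative")
    # Multiply by the index instead of accumulating, so rounding error
    # affects individual values only slightly and never the count.
    return [start + i * step for i in range(count)]
```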
I've been just wondering. Could anyone point me to any example XPath 2.0 expression, usable in the context of this spec, that is not a valid XPath 1.0? I mean -- what's the actual reason XPath 2.0 is required by the spec?
I'm asking this question as it's really hard to find an implementation of XPath 2.0 (in non-Java world). I don't want to be stopped from implementing param sweep just because my library is not capable of doing XPath 2.0...
As I see it, there are two key challenges with LoopDouble (above and beyond those of LoopInteger):
1) Accuracy of computed values
2) Format of computed values (this wasn't discussed; I only thought of it since)
The first is an issue because it can lead to a different number of iteration steps, and it stems primarily from the fact that decimal fractions are not all exactly representable in binary arithmetic. The way to fix this is to compute the number of steps to use (probably by dividing the range by the step and rounding appropriately) and to then use integer iteration behind the scenes with an appropriate (trivial) function to convert to the floating-point values. The advantage of doing this is that it does not require the use of fancy things like decimal arithmetic packages, which are not at all universally available; while Java might have one, I'd hate to have to mandate the use of Java to process JSDL documents! IEEE double precision binary floating point arithmetic is much more common (i.e. virtually universal in hardware now). In other words, this can be fixed without schema changes.
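One possible reading of "dividing the range by the step and rounding appropriately" is sketched below in Python. The function name `loop_double_count` and the tolerance of 1e-9 are assumptions made for illustration; the spec does not define either. The tolerance nudges quotients that fall just short of an integer because of binary rounding.

```python
import math


def loop_double_count(start, end, step, tolerance=1e-9):
    """Derive an integer iteration count from double-valued bounds."""
    if step == 0:
        raise ValueError("step must not be zero")
    steps = (end - start) / step
    # The tolerance (an assumed value) absorbs quotients such as
    # 2.9999999999999996 that should have been exactly 3.
    return max(math.floor(steps + tolerance) + 1, 0)


# Integer iteration behind the scenes, converted back to doubles.
count = loop_double_count(0.0, 0.3, 0.1)
values = [0.0 + i * 0.1 for i in range(count)]
```

This uses only ordinary binary floating point, in keeping with the aim of avoiding a decimal arithmetic package.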
More of an issue is the fact that LoopDouble does not specify how to format the numbers. That is, when it wants to generate the number halfway between 0.0 and 1.0, should it actually generate the string 0.5 or 0.50 or 0.500 or 5e-1 or 500e-3 or ...? You can't use the format of the arguments, since XML processors that map to an object model based on the infoset won't retain the format used but will instead focus on the logical value. (And anyway, the formats could also be inconsistent...) Either we need to specify something that is "good enough" for most uses (e.g. "a maximum of 14 significant figures, but with pointless trailing zeroes stripped so long as there is at least one digit after the decimal point; no use of scientific notation") or we need to change the schema to provide control.
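To make the "good enough" option concrete, here is one way the example rule could be implemented. This is a Python sketch only: `format_double` is a hypothetical name, the rule itself is just the example wording above and not anything the spec mandates, and Python's decimal module is used purely for output formatting. Infinities and NaN are not handled.

```python
from decimal import Context


def format_double(value, significant=14):
    """Fixed-point text, at most `significant` significant figures."""
    # Round the shortest repr of the double to the requested precision.
    rounded = Context(prec=significant).create_decimal(repr(value))
    text = format(rounded, "f")  # "f" never produces an exponent
    if "." in text:
        text = text.rstrip("0")  # strip pointless trailing zeroes
        if text.endswith("."):
            text += "0"  # keep one digit after the decimal point
    else:
        text += ".0"
    return text
```

Under this rule the midpoint of 0.0 and 1.0 is written as 0.5, and a value such as 0.30000000000000004 is written as 0.3.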
Looking through Stephen's public comment, I see that he's worried about Exception values. They're not actually an issue because we know what the step is and can therefore calculate a sensible epsilon for the "equality" test (e.g. 1% of the magnitude of the step). Perhaps the spec should make some mention of this to make implementors aware of the subtleties involved.
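The epsilon-based test could be as small as the following Python sketch. The name `is_exception` is hypothetical, and the 1% fraction is simply the example figure from the comment above.

```python
def is_exception(value, exceptions, step, fraction=0.01):
    """True if value lies within an epsilon of any listed exception."""
    epsilon = abs(step) * fraction  # e.g. 1% of the step's magnitude
    return any(abs(value - item) <= epsilon for item in exceptions)


# 0.1 * 3 is 0.30000000000000004 in binary floating point, yet it is
# still recognised as the exception value 0.3.
print(is_exception(0.1 * 3, [0.3], 0.1))
```

Since neighbouring generated values are a whole step apart, an epsilon that is a small fraction of the step cannot match two of them at once.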