Introduction
I have been writing APIs for Sorted for the past 6-7 months, and in that time I have come up with a number of utility functions that made life much simpler for me and for some of my colleagues (once they learned how to use them).
One important (and rarely noticed) problem I came across was how to efficiently parse the request parameters sent by the client and automatically typecast them where possible, throwing an error otherwise. Until today, whenever I thought about it, the only quick solution I came up with was to typecast the values manually.
So this post is about a basic (though efficient) Pythonic solution that automatically parses and typecasts variables according to a given rule.
Thought process
The task was divided into two parts:
- The rule construct.
- The parser, which typecasts values and throws an error if needed.
I was targeting a small subset of the problem, in which values have to be typecast into primitive datatypes: int, float, string and boolean.
Every API request has its own structure, so the rule has to change from API to API. Hence the rule construct should be as generic as possible, to accommodate most of the possible use cases.
Rule Construct
Let us look at an example rule:
student = {
    'name': 'str',
    'subjects': [{
        'name': 'str',
        'marks': 'float',
        'passed': 'bool'
    }],
    'hobbies': ['str'],
    'school': {
        'name': 'str',
        'estd': 'int'
    }
}
The above rule states that:
- student.name is to be a string
- student.subjects is to be a list of dicts, each having:
  - name as a string
  - marks as a decimal (floating point) number
  - passed as a boolean
- student.hobbies is to be a list of strings
- student.school.name is to be a string
- student.school.estd is to be an integer value
Note that a rule for a list has only one element, irrespective of how many items the list will hold. So a list described by a rule must be homogeneous: every item in it follows that single element rule.
The rule was designed using only 4 keywords (int, float, bool and str) along with Python's dict and list constructs. I hope the construction of rules is clear. So let us move on to the tough section, in which we parse and typecast the given input according to the rule, throwing an error where required.
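To make the constraints on a rule concrete, here is a minimal sketch of a checker for the rule structure described above. The helper `is_valid_rule` and the constant `ALLOWED_KEYWORDS` are hypothetical names introduced only for illustration; they are not part of the module in this post.

```python
# Hypothetical helper, not part of the original module: checks that a rule
# uses only the four keywords plus dict and single-element list constructs.
ALLOWED_KEYWORDS = ('int', 'float', 'bool', 'str')


def is_valid_rule(rule):
    """Return True if `rule` follows the rule construct described above."""
    if isinstance(rule, str):
        # A leaf must be one of the four type keywords.
        return rule in ALLOWED_KEYWORDS
    if isinstance(rule, (list, tuple)):
        # A list rule carries exactly one element that describes every item.
        return len(rule) == 1 and is_valid_rule(rule[0])
    if isinstance(rule, dict):
        # A dict rule is valid when the rule of each attribute is valid.
        return all(is_valid_rule(value) for value in rule.values())
    return False


student = {
    'name': 'str',
    'subjects': [{'name': 'str', 'marks': 'float', 'passed': 'bool'}],
    'hobbies': ['str'],
    'school': {'name': 'str', 'estd': 'int'},
}

print(is_valid_rule(student))                   # True
print(is_valid_rule({'tags': ['str', 'int']}))  # False: list rule has two elements
print(is_valid_rule({'age': 'number'}))         # False: unknown keyword
```

Running a check like this once, when the rule is defined, would catch a malformed rule before any request is parsed against it.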
Code Section (Parser)
from copy import deepcopy

# To check if type(var) passed as `x` is a string
# or not (either of type `str` or `unicode`)
_isStrType = lambda x: x == type('') or x == type(u'')

# To check if the type(var) passed as `attrType`
# is one of the primitive datatypes (as mentioned
# above)
_isBasicType = lambda attrType: (
    _isStrType(attrType)
    or attrType == type(1)
    or attrType == type(1.0)
    or attrType == type(True)
)

# A utility function to show a formatted
# message of datatype parsing error to user.
_msgFormatter = lambda chunkAttrVal, ruleAttrVal: (
    str(chunkAttrVal) +
    " found " + str(type(chunkAttrVal)) +
    " but api expected " +
    (
        str(type(ruleAttrVal))
        if ruleAttrVal not in ['int', 'str', 'bool', 'float']
        else str(ruleAttrVal)
    )
)
def _typecast(val, type):
    '''
    Returns `val` after typecasting it to the
    primitive type asked for in the argument `type`.
    Throws `ValueError` Exception if any typecasting
    error occurs
    '''
    val = deepcopy(val)
    if type == 'str':
        return str(val)
    elif type == 'float':
        return float(val)
    elif type == 'int':
        return int(val)
    elif type == 'bool':
        # bool('false') would be True, so compare the text instead
        val = str(val)
        if val.lower() == 'true':
            return True
        elif val.lower() == 'false':
            return False
        else:
            raise ValueError(
                "Invalid argument for boolean type : " + val
            )
    else:
        # An unknown keyword in the rule would otherwise silently return None
        raise ValueError("Unknown type in rule : " + str(type))
def parseInputParams(chunk, rulesChunk):
    '''
    The recursive function which traverses all the dict
    attributes and list elements and typecasts each of
    them as per the rule provided.
    Throws `Exception` if any parsing/typecasting error occurs
    '''
    # Used to update the typecasted value of chunk
    # when it is a list
    index = 0

    # for identifying whether chunk is list/dict
    isChunkList = type(chunk) == type([]) or type(chunk) == type(())

    # for identifying whether rulesChunk is list/dict
    isRulesChunkList = type(rulesChunk) == type([]) or type(rulesChunk) == type(())

    if isChunkList != isRulesChunkList:
        msg = _msgFormatter(chunkAttrVal = chunk, ruleAttrVal = rulesChunk)
        raise Exception(msg)

    # isList = True means both `chunk` and `rulesChunk` are lists
    # isList = False means `chunk` is a dict but `rulesChunk` can
    # be a dict or any other primitive type
    isList = isChunkList

    # Start iteration over all the elements of the chunk
    for attr in chunk:
        # get the value which is to be parsed next, depending on
        # whether chunk is a list or a dict
        if isList:
            chunkValue = attr
            # the rule is depicted by the 0th element of the
            # rule list, as explained above
            ruleChunkValue = rulesChunk[0]
        else:
            chunkValue = chunk[attr]
            if type(rulesChunk) == type({}) and attr in rulesChunk:
                ruleChunkValue = rulesChunk[attr]
            else:
                # If `rulesChunk` is neither a list nor a dict
                # containing attr, then the rule for the given
                # attr is not defined and hence it should be `None`
                ruleChunkValue = None

        chunkValueType = type(chunkValue)
        rulesChunkValueType = type(ruleChunkValue)

        # Just to make `tuple` type to `list` type for easy comparison
        if chunkValueType == type(()):
            chunkValueType = type([])
        if rulesChunkValueType == type(()):
            rulesChunkValueType = type([])

        if _isBasicType(chunkValueType) and _isBasicType(rulesChunkValueType):
            # the `chunkValue` is to be typecasted
            if isList:
                try:
                    chunk[index] = _typecast(val = chunkValue, type = ruleChunkValue)
                except ValueError as e:
                    raise Exception(e)
            else:
                try:
                    chunk[attr] = _typecast(val = chunkValue, type = ruleChunkValue)
                except ValueError as e:
                    raise Exception(e)
        elif chunkValueType == rulesChunkValueType:
            # the `chunkValue` is not of a primitive datatype and neither
            # is the `ruleChunkValue`: call the same function with the
            # subset of data to be typecasted and the subset of the rule
            # applicable
            parseInputParams(chunk = chunkValue, rulesChunk = ruleChunkValue)
        else:
            # there is some error in the chunk provided, as none of
            # the valid conditions match
            msg = _msgFormatter(chunkAttrVal = chunkValue, ruleAttrVal = ruleChunkValue)
            raise Exception(msg)

        index += 1

    return chunk
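A note on the `bool` branch of `_typecast`: it compares the lowercased text against 'true' and 'false' instead of calling `bool()` directly, because `bool()` on any non-empty string is True. The sketch below isolates that behaviour; `parse_bool` is a hypothetical stand-alone helper written for this illustration, mirroring the branch above rather than replacing it.

```python
def parse_bool(value):
    """Hypothetical stand-alone version of the bool branch of _typecast."""
    text = str(value).lower()
    if text == 'true':
        return True
    if text == 'false':
        return False
    raise ValueError("Invalid argument for boolean type : " + str(value))


print(bool('false'))        # True: any non-empty string is truthy
print(parse_bool('false'))  # False
print(parse_bool('TRUE'))   # True
print(parse_bool(True))     # True: str(True) is 'True'

try:
    parse_bool('yes')
except ValueError as error:
    print(error)            # Invalid argument for boolean type : yes
```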
With the parser in place, let's now see how we would call the function with the given student rule.
exampleRequest = {
    'name': "Gautam",
    'subjects': [{
        'name': 'Programming',
        'marks': '1',
        'passed': 'false'
    }, {
        'name': 'Photography',
        'marks': '100',
        'passed': 'true'
    }],
    'hobbies': [29, 'photography', 'coding'],
    'school': {
        'name': 'School X',
        'estd': '1993'
    }
}
try:
    parseInputParams(chunk = exampleRequest, rulesChunk = student)
except Exception as ex:
    print(ex)
The above code runs without printing any error on the console. Let's try another example and see when the parser throws an error.
wrongExampleRequest = {
    'name': "Gautam",
    'subjects': [{
        'name': 'Programming',
        'marks': 'F',
        'passed': 'false'
    }, {
        'name': 'Photography',
        'marks': '100',
        'passed': 'true'
    }],
    'hobbies': [29, 'photography', 'coding'],
    'school': {
        'name': 'School X',
        'estd': '1993'
    }
}
try:
    parseInputParams(chunk = wrongExampleRequest, rulesChunk = student)
except Exception as ex:
    print(ex)
The above example prints an error on the console (explore for yourself why!).
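As a hint for that exploration, the parser ultimately relies on Python's built-in conversions, so it helps to try them directly on the values from the two requests. The sketch below uses a hypothetical helper, `try_cast`, that is not part of the module; it only reports whether a built-in conversion succeeds.

```python
def try_cast(caster, raw):
    """Hypothetical helper: return (True, value) or (False, error message)."""
    try:
        return True, caster(raw)
    except ValueError as error:
        return False, str(error)


# Values taken from the first (valid) request
print(try_cast(float, '100'))  # (True, 100.0)
print(try_cast(int, '1993'))   # (True, 1993)
print(try_cast(str, 29))       # (True, '29')

# The 'marks' value from the second request
ok, message = try_cast(float, 'F')
print(ok)                      # False
print(message)                 # the ValueError text produced by float()
```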
Notes
There are a few things that you must take care of while using the module:
- The request params might not contain certain attribute(s) for which a rule has been defined. This will go uncaught, so please be careful.
- It can handle valid dict/list/tuple request data, provided the rule is written correctly and with care. Since tuples are immutable, a tuple whose own elements need typecasting should be converted to a list first.
- This is the first version of this basic automated rule-based typecaster in Python. I will soon upload a more advanced typecaster, including implementations in other loosely typed programming languages.
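The first note, on attributes that have a rule but are absent from the request, could be handled with a separate check before parsing. Here is a minimal sketch under that assumption; `find_missing` is a hypothetical helper that is not part of the module, and it reports only missing attributes, leaving type mismatches to the parser.

```python
def find_missing(chunk, rule, path=''):
    """Hypothetical helper: list the attributes in `rule` absent from `chunk`."""
    missing = []
    if isinstance(rule, dict):
        if not isinstance(chunk, dict):
            return missing  # a type mismatch is left for the parser to report
        for key, sub_rule in rule.items():
            child_path = path + '.' + key if path else key
            if key not in chunk:
                missing.append(child_path)
            else:
                missing.extend(find_missing(chunk[key], sub_rule, child_path))
    elif isinstance(rule, (list, tuple)) and isinstance(chunk, (list, tuple)):
        if len(rule) == 1:
            # every item of the list follows the single element rule
            for index, item in enumerate(chunk):
                item_path = '%s[%d]' % (path, index)
                missing.extend(find_missing(item, rule[0], item_path))
    return missing


rule = {
    'name': 'str',
    'subjects': [{'name': 'str', 'marks': 'float'}],
    'school': {'name': 'str', 'estd': 'int'},
}
request = {
    'name': 'Gautam',
    'subjects': [{'name': 'Programming'}],
    'school': {'name': 'School X'},
}

print(sorted(find_missing(request, rule)))
# ['school.estd', 'subjects[0].marks']
```

Whether a missing attribute should be an error or simply optional depends on the API, so a check like this is probably best kept separate from the typecasting itself.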